A Vector for Gene Trap, and A Method for Gene Trapping
by Using the Vector



Claims

1. A vector for trapping an unknown gene of Drosophila
melanogaster, which is a recombinant plasmid comprising the
following nucleotide sequences in this order:

an artificial consensus splicing acceptor site;

a synthetic "stop/start" sequence;

a reporter gene;

a drug resistance gene;

a gene responsible for a detectable phenotype of the
Drosophila melanogaster; and

a synthetic splicing donor site.



2. The vector of claim 1, wherein the recombinant plasmid
is derived from pCasper3.



3. The vector of claim 1 or 2, wherein the reporter gene is
the Gal4 gene.



4. The vector of claim 3, which has the nucleotide sequence 
of SEQ ID No. 1. 



5. The vector of claim 1 or 2, wherein the reporter gene is
the Gal4 DNA binding domain-P53 fusion gene.

6. The vector of claim 1 or 2, wherein the reporter gene is
the Gal4-firefly luciferase fusion gene.

7. The vector of any one of claims 1-6, wherein the gene
responsible for a detectable phenotype of the Drosophila
melanogaster is the mini-white gene.

8. The vector of any one of claims 1-7, wherein the drug
resistance gene is the neomycin-phosphotransferase gene and its
promoter is a heatshock promoter.

9. A vector derived from pCasperhs, which has the heatshock
promoter directed Gal4 activator domain-large T antigen fusion
gene within the polycloning site of the pCasperhs.

10. A method for trapping an unknown gene of a Drosophila
melanogaster by using a vector which is a recombinant plasmid
comprising the following nucleotide sequences in this order:

an artificial consensus splicing acceptor site;

a synthetic "stop/start" sequence;

a reporter gene;

a drug resistance gene;

a gene responsible for a detectable phenotype of the
Drosophila melanogaster; and

a synthetic splicing donor site,

which method comprises the steps of:

(a) introducing the vector into the genome of a white minus
fly;

(b) selecting primary transformants resistant to a drug;

(c) crossing the primary transformants with a transposase
source strain to force the vector to jump into other locations;

(d) selecting secondary transformants by picking up the
flies having strong eye color;

(e) crossing the secondary transformants with a UAS (Upstream
Activator Sequence)-luciferase harboring strain and measuring
the reporter gene expression of the resultant flies; and

(f) identifying the trapped gene by cloning and sequencing
the cDNAs fused to the reporter gene and the gene responsible
for a detectable phenotype of the fly.



11. The method according to claim 10, wherein the
recombinant plasmid is derived from pCasper3.

12. The method according to claim 10 or 11, wherein the
reporter gene in the vector is the Gal4 gene, and in the step
(e) the Gal4 expression is measured.



13. The method according to claim 10 or 11, wherein the
reporter gene of the vector is the Gal4-firefly luciferase
fusion gene, and in the step (e) expression of said fusion gene
is measured without crossing the secondary transformants with
the UAS-luciferase harboring strain.



14. The method according to any one of claims 10 to 13,
wherein the gene responsible for a detectable phenotype of the
Drosophila melanogaster is the mini-white gene, and in the step
(f) the cDNAs fused to the reporter gene and the mini-white gene
are cloned and sequenced.

15. The method according to any one of claims 10 to 14,
wherein the drug resistance gene is the neomycin-
phosphotransferase gene and its promoter is a heatshock
promoter, and in the step (b) the transformants resistant to
G418 are selected.

16. A method for trapping an unknown gene of a Drosophila
melanogaster by using a vector A which is a recombinant plasmid
comprising the following nucleotide sequences in this order:

an artificial consensus splicing acceptor site;

a synthetic "stop/start" sequence;

Gal4 DNA binding domain-P53 fusion gene as a reporter gene;

a drug resistance gene;

a gene responsible for a detectable phenotype of the
Drosophila melanogaster; and

a synthetic splicing donor site,

and a vector B derived from pCasperhs, which has the heatshock
promoter directed Gal4 activator domain-large T antigen fusion
gene within the polycloning site of the pCasperhs,

which method comprises the steps of:

(a) introducing each of the vectors A and B into the genomes
of separate white minus flies;

(b) selecting primary transformants for the vector A which
are resistant to a drug, and selecting primary transformants
for the vector B which have an eye color;

(c) crossing the primary transformants for the vector A with
a transposase source strain to force the vector to jump into
other locations;

(d) selecting secondary transformants for the vector A by
picking up the flies having strong eye color;

(e) crossing the secondary transformants with the primary
transformants for the vector B to obtain flies harboring both
the vectors A and B;

(f) crossing the flies obtained in the step (e) with a UAS-
luciferase harboring fly strain and measuring the reporter gene
expression of the resultant flies; and

(g) identifying the trapped gene by cloning and sequencing
the cDNAs fused to the reporter gene and the gene responsible
for a detectable phenotype of the fly.

17. The method according to claim 16, wherein the vector A
is derived from pCasper3.

18. The method according to claim 16 or 17, wherein the gene
responsible for a detectable phenotype of the Drosophila
melanogaster is the mini-white gene, and in the step (g) the
cDNAs fused to the reporter gene and the mini-white gene are
cloned and sequenced.

19. The method according to any one of claims 16 to 18,
wherein the drug resistance gene is the neomycin-
phosphotransferase gene and its promoter is a heatshock
promoter, and in the step (b) the transformants resistant to
G418 are selected.

Detailed Description of Invention

Technical Field

The present invention relates to a new vector system to
facilitate the cloning and functional analysis of new genes of
a fly, Drosophila melanogaster, and a method for gene trapping
with the vector system.

Background Art

There are numerous examples of the application of gene
trapping methods in a wide range of living organisms, including
maize and mouse (Gossler et al., Science, 244:463-465, 1989).

With respect to tools for gene trapping, the application
of different types of enhancer trap P-element vectors (Wilson
et al., Genes & Development, 3:1301-1313, 1989) for cloning and
analyzing trapped genes, as well as their use for mosaic analysis
with the help of the Gal4/UAS transcription activator system,
has proven fruitful. However, sometimes the expression pattern
of the Gal4 or other reporter gene of the vector construct is
affected by enhancers belonging to more than one gene.
Similarly, in some cases it is difficult to determine whether
the enhancer trap insertion affects the function of one or more
of the neighboring genes.

These circumstances, together with the fact that in
some cases the mutant phenotype can be attributed to the
changed expression of a gene whose nearest exon is located more
than 30 kb away from the insertion site, can in unfortunate
cases turn the cloning and analysis of the affected gene into
an ordeal.

One object of this application is to provide a vector
system that includes specifically designed artificial regulatory
sequences as well as selection methods for easy screening of
positive recombinant lines. More specifically, this application
intends to provide a vector system offering much easier and
faster cloning of the affected gene compared to the widely used
enhancer trap P-element vectors. Another object of this
application is to provide easier detection of successful
trapping events and a much higher chance of obtaining more
characteristic ("functional") expression patterns of the
reporter gene because, in contrast to most cases with enhancer
trap lines, when using the vector system of this invention the
reporter gene expression is influenced by only a single
endogenous transcription unit and affects only the expression
of that very same gene.

Disclosure of Invention

The first invention of this application is a vector for
trapping an unknown gene of Drosophila melanogaster, which is a
recombinant plasmid comprising the following nucleotide
sequences in this order:

an artificial consensus splicing acceptor site;

a synthetic "stop/start" sequence;

a reporter gene;

a drug resistance gene;

a gene responsible for a detectable phenotype of the
Drosophila melanogaster; and

a synthetic splicing donor site.

One embodiment of the first invention is that the
recombinant plasmid is derived from pCasper3.

Other embodiments of the first invention are that the
reporter gene is the Gal4 gene, the Gal4 DNA binding domain-P53
fusion gene or the Gal4-firefly luciferase fusion gene.

A further embodiment of this first invention is that the
gene responsible for a detectable phenotype of the Drosophila
melanogaster is the mini-white gene.

A still further embodiment of the first invention is that
the drug resistance gene is the neomycin-phosphotransferase gene
and its promoter is a heatshock promoter.

The second invention of this application is a method for
trapping an unknown gene of a Drosophila melanogaster by using
a vector which is a recombinant plasmid comprising the
following nucleotide sequences in this order:

an artificial consensus splicing acceptor site;

a synthetic "stop/start" sequence;

a reporter gene;

a drug resistance gene;

a gene responsible for a detectable phenotype of the
Drosophila melanogaster; and

a synthetic splicing donor site,

which method comprises the steps of:

(a) introducing the vector into the genome of a white minus
fly;

(b) selecting primary transformants resistant to a drug;

(c) crossing the primary transformants with a transposase
source strain to force the vector to jump into other locations;

(d) selecting secondary transformants by picking up the
flies having strong eye color;

(e) crossing the secondary transformants with a UAS (Upstream
Activator Sequence)-luciferase harboring strain and measuring
the reporter gene expression of the resultant flies; and

(f) identifying the trapped gene by cloning and sequencing
the cDNAs fused to the reporter gene and the gene responsible
for a detectable phenotype of the fly.

The third invention of this application is a method for
trapping an unknown gene of a Drosophila melanogaster by using
a vector A which is a recombinant plasmid comprising the
following nucleotide sequences in this order:

an artificial consensus splicing acceptor site;

a synthetic "stop/start" sequence;

Gal4 DNA binding domain-P53 fusion gene as a reporter gene;

a drug resistance gene;

a gene responsible for a detectable phenotype of the
Drosophila melanogaster; and

a synthetic splicing donor site,

and a vector B derived from pCasperhs, which has the heatshock
promoter directed Gal4 activator domain-large T antigen fusion
gene within the polycloning site of the pCasperhs,

which method comprises the steps of:

(a) introducing each of the vectors A and B into the genomes
of separate white minus flies;

(b) selecting primary transformants for the vector A which
are resistant to the drug, and selecting primary transformants
for the vector B which have an eye color;

(c) crossing the primary transformants for the vector A with
a transposase source strain to force the vector to jump into
other locations;

(d) selecting secondary transformants for the vector A by
picking up the flies having strong eye color;

(e) crossing the secondary transformants with the primary
transformants for the vector B to obtain flies harboring both
the vectors A and B;

(f) crossing the flies obtained in the step (e) with a UAS-
luciferase harboring fly strain and measuring the reporter gene
expression of the resultant flies; and

(g) identifying the trapped gene by cloning and sequencing
the cDNAs fused to the reporter gene and the gene responsible
for a detectable phenotype of the fly.

Embodiments of the second and third inventions
correspond to the embodiments of the first invention, and
they will be described more precisely in the following
description.



A Mode for Carrying Out the Invention 

A vector construct of the first invention, for example,
can be based on the commonly used P-element transformation
vector, pCasper3 (Pirotta, Vectors: A survey of molecular
cloning vectors and their uses, eds. Rodriguez, R.L. & Denhardt,
D.T., Butterworths, Boston, 437-456, 1998) and the convenient
Gal4-UAS expression system (Brand and Perrimon, Development,
118:401-415, 1993).

A promoterless Gal4 gene, preceded by an artificial
consensus splicing acceptor site and a synthetic "stop/start"
sequence to direct the read-through translation coming from
upstream exon(s) of the trapped gene into the proper reading
frame of Gal4, was inserted into the polycloning site of
pCasper3.

The removal of the whole 3' UTR (untranslated region)
sequence of the mini-white gene and its replacement by an
artificial splicing donor site resulted in a truncated gene
without its own poly-adenylation site.

Without a successful gene trapping event, this truncated
mini-white gene was not expected to confer any eye color;
therefore, in this invention a heatshock promoter directed
neomycin-phosphotransferase (hs-neo) gene has been inserted to
help the selection of primary transformants by antibiotic
feeding.

Figure 1 shows the schematic of the gene trap
construct (pTrap-hsneo), and SEQ ID No. 1 is the complete
nucleotide sequence of the vector pTrap-hsneo.
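
As an illustrative aid only, the layout of pTrap-hsneo can be
checked with a short script. The sketch below copies the
coordinates from the feature table of SEQ ID No. 1 and verifies
that the listed elements follow one another without gaps and add
up to the stated length of 11206 base pairs. The names FEATURES,
feature_lengths and is_contiguous are hypothetical helpers
introduced here and are not part of the invention.

```python
# Feature coordinates (1-based, inclusive) as given in the feature
# table of SEQ ID No. 1. The variable and function names are
# illustrative assumptions, not part of the original disclosure.
FEATURES = [
    (1, 237, "3'P sequence"),
    (238, 274, "synthetic splicing acceptor site and stop/start sequence"),
    (275, 3164, "Gal4 gene (coding region and 3'UTR)"),
    (3165, 3426, "hsp70 terminator"),
    (3427, 3457, "synthetic junction sequence"),
    (3458, 4907, "hs promoter directed neomycin resistance gene"),
    (4908, 8275, "mini-white gene"),
    (8276, 8299, "synthetic splicing donor site"),
    (8300, 8446, "5'P sequence"),
    (8447, 11206, "bacterial part of pCasper3 shuttle vector"),
]
TOTAL_LENGTH = 11206


def feature_lengths(features):
    """Return (name, length in bp) for each feature."""
    return [(name, end - start + 1) for start, end, name in features]


def is_contiguous(features, total_length):
    """True if the features tile positions 1..total_length with no gap or overlap."""
    expected_start = 1
    for start, end, _name in features:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start - 1 == total_length


if __name__ == "__main__":
    for name, length in feature_lengths(FEATURES):
        print(f"{length:6d} bp  {name}")
    print("contiguous:", is_contiguous(FEATURES, TOTAL_LENGTH))
```

The order of the tuples mirrors the order of elements required
by the first invention: acceptor site and stop/start sequence,
reporter gene, drug resistance gene, mini-white gene, donor site.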

Another gene trap construct, pTrap-G4-p53 (Figure 2), is
created by replacing the Gal4 coding sequence of plasmid pTrap-
hsneo with a Gal4 DNA binding domain-P53 fusion gene (Clontech,
Matchmaker Two Hybrid System, #K1605-1). When this construct
coexists in the genome of the same fly with another vector,
pCasperhs-G4-LT (Figure 3), containing a heatshock promoter
directed Gal4 activator domain-large T antigen (Clontech,
Matchmaker Two Hybrid System, #K1605-1) fusion gene, the
assembly of a functional Gal4 molecule, through p53-large T
antigen interaction, can be regulated by external heatshock.

In this way, intentional temporal control of Gal4
activity became possible. In other words, the Gal4 expression,
in a pattern already determined spatially by the promoter of
the trapped gene, can now be induced at any desired stage of
development by external heatshock.



ail£E#¥ 11-3048126 



10-141952 



In order to make the detection of Gal4 expression easier,
the Gal4 gene in another construct is replaced with a Gal4-
firefly luciferase fusion gene to get pTrap-G4-luc (Figure 4).
This artificial gene codes for a fusion polypeptide which
has preserved both enzymatic activities.

The easy measurement of luciferase activity by luminoassay
(Brandes et al., Neuron, 16:687-694, 1996) makes the detection
of Gal4 activity convenient in every single living fly.

Next, one of the best modes of the second or third
invention, a method for gene trapping using the vector system,
is described in detail.
(1) Screening: 

The gene trap vector constructs can be introduced into
the genome of a white minus fly by microinjection. The
selection of primary transformants is possible by using the
resistance to G418, an analog of neomycin, conferred by the
hs-neo gene. (When performing transformation experiments with
these constructs, it turned out that the truncated mini-white
gene generally provides a very slight yellow eye color which
could be distinguished from the w- phenotype in most cases;
therefore G418 selection apparently is not necessary.)

After a line with the gene trap construct has been
established, the secondary transformants can be generated in
the usual way by crossing the original line with a so-called
jumpstarter containing the transposase expressing delta 2-3
genetic element.

Usually a certain percentage, between 4 and 8, of the
secondary transformants have a much stronger eye color (deep
orange or reddish) than the ancestor fly, indicating that the
construct has been inserted downstream of a promoter and that
the mini-white gene is now using the transcriptional "facilities"
of that gene (e.g., poly-adenylation site and transcriptional
terminator) instead of its removed ones. They are the most
likely candidates for successful gene trap events. In these
lines the vector probably has been inserted either into
an intron of a gene or upstream from the first intron into the
5' UTR, in the proper orientation (that is, the direction of
transcription is the same for the "trapped gene" and for the
mini-white and Gal4 genes). The mini-white gene has its own
promoter; therefore its expression pattern is supposed to be
largely independent from that of the trapped gene.

These positive lines are to be checked in the next step
for Gal4 expression by crossing them with a "marker" line
harboring a UAS-luciferase reporter gene construct. (When
using the pTrap-G4-luc vector, this step is obviously not
necessary.) Usually a very strong correlation was found between
eye color and Gal4 expression: more than 90% of the lines
having strong eye color proved to be expressing Gal4 by means
of luciferase assay using a luminometer (Brandes et al., Neuron,
16:687-692, 1996).
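
The screening figures quoted above can be turned into a rough
planning estimate. The following sketch is only an arithmetic
illustration: it assumes that 4 to 8 percent of the secondary
transformants show strong eye color and treats the "more than
90%" of Gal4-expressing lines as a lower bound of 0.90. The
function names and the example cohort size are hypothetical.

```python
def expected_candidates(n_secondary, strong_eye_fraction, gal4_fraction=0.90):
    """Estimate (strong-eye-color lines, Gal4-expressing lines).

    gal4_fraction defaults to 0.90, used here as a lower bound for the
    'more than 90%' correlation reported in the text.
    """
    strong = n_secondary * strong_eye_fraction
    return strong, strong * gal4_fraction


def candidate_range(n_secondary):
    """Return the estimates for the 4% and the 8% ends of the reported range."""
    low = expected_candidates(n_secondary, 0.04)
    high = expected_candidates(n_secondary, 0.08)
    return low, high


if __name__ == "__main__":
    # Hypothetical cohort of 1000 secondary transformants.
    low, high = candidate_range(1000)
    print("strong eye color lines: %.0f to %.0f" % (low[0], high[0]))
    print("Gal4 expressing lines (at least): %.0f to %.0f" % (low[1], high[1]))
```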

(2) Cloning:

When the gene trap construct is inserted into an
intron of an endogenous gene, the marker genes of the construct
are supposed to be spliced at the mRNA level to the exons of the
trapped gene by using the artificial splicing acceptor and
donor sites. More exactly, while the Gal4 mRNA should be joined
to the exon(s) located upstream of the insertion site, at the
same time the mini-white mRNA is fused to the following exon(s),
accomplishing the dual tagging of the trapped gene (Figure 5).



This feature can be used for quickly and easily
identifying the trapped gene by means of 3' and 5' RACE (Rapid
Amplification of cDNA Ends) experiments. Even cloning and
sequencing only a part of the caught mRNA still provides a
reasonable chance to find homologous mRNAs in the BDGP
(Berkeley Drosophila Genome Project) EST (Expressed Sequence
Tag) library.
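
One way to picture the analysis of a 5' RACE product is as a
simple string operation: the part of the read that lies upstream
of the known start of the vector-derived sequence belongs to the
trapped gene and can be used for the database search. The sketch
below is a minimal illustration under that assumption; the
function name and the toy sequences are hypothetical and are not
taken from SEQ ID No. 1, and real reads would need tolerant
(mismatch-aware) alignment rather than exact matching.

```python
def split_5prime_race_read(read, vector_start):
    """Split a 5' RACE read at the first exact match of vector_start.

    Returns (upstream_exon_part, vector_part), or None if the known
    vector-derived sequence is not found in the read.
    """
    read = read.upper()
    vector_start = vector_start.upper()
    idx = read.find(vector_start)
    if idx == -1:
        return None
    return read[:idx], read[idx:]


if __name__ == "__main__":
    # Toy sequences for illustration only.
    toy_read = "ccgattgcaaATGAAGCTACTG"
    toy_vector_start = "ATGAAGCTAC"
    result = split_5prime_race_read(toy_read, toy_vector_start)
    if result is not None:
        exon_part, vector_part = result
        print("sequence to search against the EST library:", exon_part)
        print("vector-derived part:", vector_part)
```

The same idea would apply in mirror image to a 3' RACE product,
where the trapped-gene sequence follows the mini-white part.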

With these approaches, the identification of an already
cloned gene can take less than a week, compared to the period
of usually more than one year on average when analyzing a
mutant created by some enhancer trap construct.

It is well known from the literature, and the present
inventors have also experienced, that P-element vectors tend to
integrate into or near the 5' UTR of active genes. (The
present inventors found that in these cases, if the insertion
happened upstream from the first intron, and therefore the
artificial splicing acceptor site could not be utilized, the
Gal4 gene was expressed by read-through transcription from the
nearby promoter.)

Advantage can be taken of this tendency by cloning
and sequencing the flanking genomic sequences of the insertion
site by inverse or vectorette PCR, or by plasmid rescue using
suitable restriction digestion to recover the neomycin
resistance gene of the construct. Then again the BDGP library
can be searched to find any significant match.

(3) Rescue:

The only reliable way to confirm that any observed
mutant phenotype is really the consequence of the P-element
insertion is to rescue that particular phenotype. Expectedly
the phenotype (some alteration from the wild type fly) is caused
by changed expression of gene(s) disturbed by insertion of the
P-element. The rescue can be made by expressing the cDNA of the
suspected gene, most preferably with a spatial and temporal
pattern identical to that of the gene itself.

As was expected, the vector constructs of the first
invention usually cause strong phenotypes. This is not
surprising at all, because the trapped genes are supposed to be
split into two parts at the mRNA level, resulting in null
mutants in the majority of cases. Accordingly, mutants obtained
by this method frequently show homozygous lethality or
sterility. Hypomorphic mutants can be obtained by forcing
imprecise excision of the gene trap P-element construct.

As mentioned above, the Gal4 expression is obliged to
precisely reflect that of the trapped gene, simply because
the Gal4 gene has no promoter of its own and they share a
common, fused mRNA.

This identical expression provides a unique opportunity to
rescue the mutant phenotype by crossing this fly with another
one harboring the UAS directed, cloned cDNA of the trapped gene.

In this way either the original, homozygous null mutant
gene trap fly or any transheterozygous derivative of it with
some hypomorphic allele over the null mutant allele can be
rescued.

(4) Determination of spatial and developmental expression
pattern of the trapped gene:

Determination of the spatially and temporally controlled
expression of any trapped gene is also easy following
introduction of a UAS-lacZ construct into the genome of the
same fly and performing either X-gal or antibody staining
against beta-galactosidase.

(5) Mosaic analysis:

Possession of a large collection of fly lines with
characteristic and, in the case of the pTrap-G4-
p53/pCasperhs-G4-LT vector system, inducible Gal4 expression
patterns makes it feasible to carry out mosaic analysis of
virtually any gene of interest by directing the expression of
their UAS-constructs on a mutant background with different Gal4
lines.
This approach can answer the question of where and when
that particular gene is required to be expressed to rescue the
mutant phenotype.

Similarly, any gene can be expressed in different
ectopic patterns to generate new dominant mutant phenotypes.
This approach might help to determine the role of that
particular gene and to identify the pathway in which it is
involved.



The following example illustrates a specific embodiment
of the various aspects of the invention. This example is not
intended to limit the invention in any manner.
Figure 6 shows the results of sequencing RT-PCR products
of aop-Gal4 and m-white-aop fusion mRNAs.

The template was total RNA prepared from a positive gene
trap line which has the vector pTrap-hsneo integrated
into the first intron of the well-known aop (anterior
open/pokkuri/yan) developmental gene. The sequences confirm
that both splicing events occurred precisely at the particular
nucleotides of the artificial regulatory sequences where they
were expected.

Figure 7 shows pictures of characteristic beta-
galactosidase staining patterns in different parts of the fly
brain resulting from crossing positive gene trap lines with
flies harboring a UAS-lacZ construct.

Effects of the Invention 

The vector system of this invention offers an
exceptional opportunity for easy and fast cloning of the gene
responsible for the observed phenotype. Furthermore, by using
the UAS-driven coding sequence of any gene of interest, that
particular gene can be expressed in patterns identical to
those of the trapped genes, and these expressions can be
regulated temporally at any desired developmental stage.

Sequence Listing

SEQ ID No.: 1

LENGTH: 11206 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: circular
MOLECULAR TYPE: DNA
FEATURE:

LOCATIONS : kind of sequence

0001-0237 : 3'P sequence
0238-0274 : synthetic splicing acceptor site and
            stop/start sequence
0275-3164 : Gal4 gene (coding region and 3'UTR)
3165-3426 : hsp70 terminator
3427-3457 : synthetic junction sequence
3458-4907 : heat shock promoter directed neomycin
            resistance gene on complementary strand
4908-8275 : mini-white gene
8276-8299 : synthetic splicing donor site
8300-8446 : 5'P sequence
8447-11206: bacterial part of pCasper3 shuttle vector
            including complete pUC8 sequence
0238-0274 : synthetic DNA
3427-3457 : synthetic DNA
4908-4914 : synthetic DNA
8276-8299 : synthetic DNA
SEQUENCE:

GATGATGAAA TAACATAAGG TGGTCCGGTC GQCAAGAGAC ATGGACTTAA GGTATGGTTG 60 
CAATAACTGG GAGTGAAAQG AATAGTATTC T6A6TGTCGT ATT6AGTCTG AGTGA6ACAG 120 
CGATATGAH GTTGATTAAC CCTTAGCATG TOCGTCaGGT TTGAATTAAC TCATAATATT 180 
AATTAGACGA AATTATTTTT AAAGTTTTAT TTTTAATAAT TTQCGAGTAC GCAAAGCTCT 240 
TTCTCTTACA OGTCGAAnC ATGTGATGGA TtXJAATGAAG CTACTGTCTT CTATCGAAGA 300 
AKJATGCGAT ATTTGCCGAC TTAAAAAGGT CAAGTQCTCC AAAGAAAAAC CGAAGTGCGC 360 
GAAGTGTCTG AAGAACAACT GGGAGTGTGG GTACTCTCCC AAAACCAAAA GGTGTCGGCT 420
GACTAQQGCA CATCTGACAG AAGTGGAATC AAGGCTAGAA AGACTGGAAC AGCTATTTCT 480

ACT6ATTTTT (X3TCGAGAAG AOCTTGACAT GATTTTGAAA ATGGATTCTT TAGftGGATAT 540 

AAAAGCATTG TTAACAGGAT TATTTGTACA AGATAATGTG AATAAAGATG OCOn^CACAGA 600 

TAGATTQGCT TCAGTGGAGA CT6ATATQCC TCTAACATTG AGACAGCATA GAATAAfiTQC 660 

GAGATCATCA TCGGAAGAGA GTA6TAACM AGGTCAAAGA GAGTTGM}TG TATCGATTGA 720 

CTCGGCAGOr GATCATGATA AGTCCACAAT TCOGTTGGAT TTTATQOCCA GGGATQCTCT 780 

TCATGGATTT GATTQGTCTG AAGAGGATGA GATGTCG6AT QGCTIGOCCT TOCTGAAAAC 840 

QGAOCCCAAC AATAATGGGT TCTTTGGCGA CGGTTCTCTC TTATGTATTC TTCGATCTAT 900 

TGGCTTTAAA CCGGAAAATT PCACGMOTG TAACGHAAC AGGGTCCGGA OaVTGATTAC 960 

GGATAGATAC ACGTTGGCTT CTAGATOCAC AACATOCCGT TTACTTCAAA GTTATCTCAA 1020 

TAATTTTCAC COCTACTGCC CTATCGTQCA CTCACGGACG CTAATGATGT TGTATAATAA 1080 

GCAGATTGAA ATOGCGTCGA AGGATGAATG GGAAATCCTT TTTAACTOyV TATTAQCCAT 1140 

TGGAGCCTGG TGTATM3AGG GGGAATCTAC TGATATAGAT till I II lACT ATCAAAATQC 1200 

TAAATCTCAT TTGACGAQCA AGGTCTTCGA GTCAQGTTOC ATAATTTTGG TGACAGGGGT 12«) 

ACATCTTCTG TCGCSATATA CACAGTGGAG GCA6AAAACA AATACTAQCT ATAATTTTCA 1320 

CAGGTTTTCG ATAAGAATGG CCATATCATT GGQCTTGAAT AGGGACCTCC CCTGGTOCTT 1380 

CAGTGATAGC AGCATTCTGG AACAAAGACG OCGAATTTQG TGGTCTGTCT ACTCTTGGGA 1440 

GATCCAATTG TCCCTGCTTT ATGGTCGATC CATCCAGCR TCTCAGAATA CAATCTOCTT 1500 

CCCTTCTTCT GTCGAC6ATG TQCAGCGTAC CAGAACAGGT CCCACGATAT ATCAT(»CAT 1560 

CATTGAAAGA GCAAGGCTCT TACAA6TTTT CACAAAAATC TATGAACTAG ACAAAACAST 1620 

AACTQCAGAA AAAAGTCCTA TATGTGCAAA AAAATGCTTG ATGATTTGTA ATGAGATTGA 1680 

GGAGGTTTCG AGAGAGGCAC CAAAGTTTTT ACAAATGGAT ATTTCCACCA CCGCTCTAAC 1740 

CPA\ I Itil IG AAGGAACACC CTTGGCTATC CTTTACAAGA TTCGAACTGA AGTQGAAACA 1800 

GTTGTCTCrr ATCATTTATG TATTAAGAGA I II 1 1 IGACT AATITTACCC AGAAAAAGTC 1860 

AC^^AGAA CAGGATCAAA ATGATCATGA AAGTTATGAA 6TTAAACGAT GCTCCATCAT 1920 

GTTAMCGAT GOffiCACAAA GAACTGTTAT GTCTGTAA6T AGCTATATGG ACAATCATAA 1980 

TGTCACCOCA TATTTTGCCT QGAATTGTTC TTATTACTTG TTCAATGCAG TOCTAGTACC 2040 

CATAAAGACT CTACTCTCAA ACTCAAAATC GAATGGTGAG AATAACGAGA CCGCACAATT 2100 

ATTACAACAA ATTAACACTG TTGTGATGCT ATTAAAAAAA CTGGCCACTT TTAAAATCCA 2160

GACTTGTGAA AAAT/«^TTC AAGTACTGGA AGAGGTATGT GCGCCGTrTC TGTTATCACA 2220

GTGTGCAATC CCATTACJCGC ATATC/^A TAACAATAGT AATQGTAGCG CXJATTAAAAA 2280 

TATTGTCGGT TCTQCAACTA TCGCOGAATA OCCTACTCTT CCG6AQGAAA ATGTCAACAA 2340 

TATCAGTGTT AAATATGITT CTCCTGGCTC AGTAGGQCCT TaWXJTGTGC CATTGAAATC 2400 

AGGAGCAAfiT TTCAGTGATC TAGTCAAGCT GTTATCTAAC OGTCCACOCT CTOGTAACTO 2460 

TCCAGTGACA ATACCAAGAA GCACACCTTC GCATOGCTCA GTCACQCCTT TTCTAGGGCA 2520 

ACAGCAACAG CTQCAATCAT TAGTQCCACT GADCCC6TCT GUIIItilllG GTQQOQCCAA 2580 

TTTTAATCAA AGTGGGAATA TTGCTGATAG CTCATTGTGC TTCW3TTTCA CTAACACTAfi 2640 

CAACGGTCCG AAOCTCATAA CAACTCAAAC AAATTCTCAA QCGCTTTIWJ AAGCAATTGC 2700 

CTCGTCTAAC GTT(»TGATA >«rrTCATGAA TAATGAAATC ACGGCTAGTA AAATTGATGA 2760 

TQGTAATAAT TCAAAAOCAC TGn^CACCTGG TTGGACQGM) CAAACTGCGT ATAACaXSTT 2820 

TQGAATCACT ACAGGGATGT TTAATACCAC TMIAATGGAT GATGTATATA ACTATCTATT 2880 

GGATGATGAA GATACGOGM; GAAACCCAAA AAAAGAGTAA AATGAATGGT AGATACTGAA 2940 

AAACOCCGCA AGTrCACTTC AACTCTGCAT CGTGCACCAT CTCAATTTCT TTCATTTATA 3000 

CATC6TTTTG OCTTCTTTTA T6TAACTATA CTCCTCTAAG TTTCAATCTT GGCGATGTAA 3060 

CCTCTGATCT ATAGAATTTT TTAAATGACT AGAATTAATG OCCATCTTTT TTTTGGACX3T 3120 

AAATTCTT<yV TGAAAATATA TTAOGAGGQC TTATTCAGAA QCTTATCGAT ACOSTCGACT 3180 

AAAGCCAAAT AGAAATTATT CAGTTCTaC TTAAGTTTTT AAAAGTGATA TTATTTATTT 3240 

GGTT6TAACC AACCAAAAGA ATGTAAATAA CTAATACATA ATTATGnTTAG TTTTAAGnA 3300 

GCAACAAATT GATTTTAGCT ATATTAGCTA CTTQGTTAAT AAATAGAATA TATTTATTTA 3360 

AAGATAATTC GTTTTTATTG TCAGGGAG[TG AGTTTGCTTA AAAACTCGTT TAGATCGACT 3420 

AGAAGGACC6 CGGCTCCTCG AOCGGATCGA AAGGAGGGCXS AAGAACTCX^A GCATGAGATC 3480 

OCGGCGCTGG AGGATCATCC AGCOGGCGTC OOGGAAAACG ATTCOGAAGC GCAACCTTTC 3540 

ATAGAAGQOG GGGGTQGAAT CGAAATCTCG TGATGGCAGG TrGGQCGTGG CTTQGTGGGT 3600 

CATTTCGAAC GGCAGAGTCC {X3CTCA6AAG AACTCGTCAA GAAQGGGATA GAAQQCGATG 3660 

CGGTGCGAAT GG^GCGGC GATACGGTAA AGCACGAGGA AGCGGTCAGC GCATTCGCCG 3720 

COAAGGTCTT GAGCAATATC AC(MGTAGCC AACGCTATGT CCTGATAGCG GTOGGtXJACA 3780 

CXXJAGCGGQC CACAGICGAT 6AAT0CAGAA AAGCGGC(^T TTTG(^CCAT GATATTCGQC 3840 

AAGCAQGCAT CGOCATGGGT CACGAOGAGA TGCTCGGOGT CGGGCATQCG OGCCTTGAQC 3900
CTGGOGAACA GTTOGGCTQG CGC6AGCCCC TGATGCTCTT CGTCCAGATC ATCCT6ATCG 3960

ACAAGACCQG CTTCCATCOG AGTACGTGCT CGCTCGATQC GATGTTTCGC TTGGTGGTCG 4020 

AATGGGCAQG TAQOCGGATC AAGOGTATGC AGCCGCCQCA TTGCATCAfiC CATGATQGAT 4080 

ACTTTCTCQG CAG6AGCAAG GTGAGAT6AC AGGAGATCGT QCXXXX3GCAC TTCGCOCAAT 4140 

AQCAG0CA6T OCCTTCCOQC TTCAGTGACA A£X3TCGAGCA CAGCTGCGCA AGGAAOGOCC 4200 

GTCGTGGCCA GCCAOGATAG CGGCQCTGCC TCGTCCTGCA GTTCATTCAG GGCACOQGAC 4260 

AGGTGGGTCT TGACAAAAAG AACCGGGCGC GCCTGCGCTG ACAQOOGGAA CACQQOQQCA 4320 

TCAGAGCAGC CXSATTGTCTG TTGTGCCCAG TCATAGOOGA ATAQCCTCTC CACCCAAGCG 4380 

GOCGGAGAAC CTGCGTGCAA TCCATCTTCT TCAATCATGC GAAACGATCC TCATOCTGTC 4440 

TCTTGATCAG ATOCCCTATT CAGAGTTCTC TTCTTGTATT CAATAATTAC TTCTTQQCAG 4500 

ATTTCAGTAG TTGGAGTTGA TTTACTTGGT TGCTGGTTAC TTTTAATTGA TTCACTTTAA 4560 

CTTGCWmr AGTGCAGATT GTTTAGGTTG TT(^GCTGOG CTTGTTTATT TGGTTAGCTT 4620 

TOQCTTAQOG ACGT6TTCAC TTTGCTTGTT TGAATTGAAT TGTCGCTGCG TAGACGAAQC 4680 

GCCTCTATTT ATACTCCGGC GCTCTTTTCG CGAACATTGG AGGOGOGCTC TCTCGAACCA 4740 

AC6AGAQCAG TATGCCGTTT ACTGTGTGAC AG/«3TGAGAG AGCATTAGTG CAGAGAGGGA 4800 

GA6ACXXJAAA AA6AAAAGAG AGAATAAtXSA A^MCt^GOG^ GAGAAATTTC TCGAGTTTTC 4860 

TTTCTGCCAA AGAAATGAOC TACCACAATA ACCAGTTTGT TTTQGGATCT AGTCOCTAAT 4920 

TCTAGTATGT ATGTAAGTTA ATAAAACCCT TTTTTGGAGA AT6TAGATTT AAAAAAACAT 4980 

ATTTTTTTTT TATTTTTTAC TGCACTG6AC ATCATTGAAC TTATCTGATC AGTTTTAAAT 5040 

TTACTTCGAT CCAAGGGTAT TTGAAGTACG AGGTTCTTTC GATTACXTTCT GACTCAAAAT 5100 

GACATTCCAC TCAAAGTCAG CGCTGTTTQC CTCCTTCTCT GTCCACAGAA ATATCGCCGT 5160 

CTCTTTCGCC GCTGOGTCGG CTATGTCTTT CGCCACCGTT TGTAGCGTTA GCTAQCGTCA 5220 

ATGTOCGOCT TCAGTTGC^ TTTGTCAGCG GTTTCGTGAC GAAGCTCGAA GCGGTrTACG 5280 

CCATCAATTA AACACAAAGT GCTGTGCCAA AACTCCTCTG GCTTCTTATT TTTGI I IGI I 5340 

TTTTGAGTGA TTOSGGTGGT GATTGGTTTT GGGTGGGTAA GCAGQGGAAA 6TGTGAAAAA 5400 

TCCCGGCAAT GGGCCAAGAG GATCAGGAGC TATTAATTCG CGGAGGCAGC AAACACCCAT 5460 

CTGCCGAGCA TCTGAACAAT GTGAGTAGTA CATGTGCATA CATCTTAAGT TCACTTGATC 5520 

TATAGGAACT GCGATTGCAA CATCAAATTG TCTGCGGCGT GAGAACTQCG ACCCACAAAA 5580 

ATOCCAMOC GCAATCGCAC AAACAAATAG T6ACAC6AAA CAGATTATTC TG6TAGCTGT 5640 
GCTCGCTATA TAAGftCAATT TTTAAGATCA TATCATGATC AAGAGATCTA AAGGCATTCA 5700

nrrCGACTA CATTCI mm TACAAAAAAT ATAACAACXIA GATATTTTAA GCTGATCCTA 5760 

GATGGACAAA AAATAAATAA AAGTATAAAC CTACTTGGTA QGATACTTCG mTGnCGG 5820 

GGHAGATGA GCATAACQCT TGTAGTTGAT ATTT6AGATC CCCTATCATT QCAGQGTGAC 5880 

AQCGGACQCT TOQCAGAQCT GCATTAAOCA GQGCrrOQQG CAGQOCAAAA ACTAOGQCAC 5940 

eCTCCTOXJA CCCAGT(XX3C CGGAGGACTO CQGTTCAQQG AGCQQCCAAC TAGCC6AGAA 6000

CCTCACCTAT GCCTQGCACA ATATQGACAT CTTTGGQSCG GTCAATCAGC CGGGCT(X»G 6060 

ATGGCGQCAS CTQGTCAACC GGACACGCCSG ACTATTCTQC AAOGAGCGAC ACATAlCCGQC 6120 

QCCCAGGAAA CATTTGCTCA AGAACGGTGA GrTTCTATTC GCAGTOGGCT GATCTCTGTG 6180 

AAATCTTAAT AAAGGGTOCA ATTACCAAn TGAAACTCAG TTTQtXJGCGT GGCCTAT(X» 6240 

GGCGAACTTT TQGCC6TGAT GGGCAGTTCC G6TGCGGGAA AGACGACOCT GCTGAATGOC 6300 

CTTGGCTTTC GATGGCGGCA GGGCATCCAA 6TATCGCGAT CCGGGATGOG ACTGGTCAAT 6360 

GQCCAACCTG TQGACGGCAA GGAGATGCAG GGCAGGTGOG COTATGTOCA GCAGGATGAC 6420 

CTCTTTATC6 GCTCCCTAAC GGCCAGGGAA CAOCTGATTT TCCAGGGCAT GGTGCG6ATG 6480 

CCAC6ACATG TGAGCTATC6 GCAGG6AGTG GG0CGC6TGG ATCAGGTGAT CCAGGAGCTT 6540

TCGCTCAGCA AATGTCAGCA GACGATCATC GGTGTGGCCG GCAGQGTGAA AGGTCTGTCC 6600

QGCGGAGAAA QGAAGCGTCT GGGATTCGGC TGCGAGGCyVC TAACXXSATOC GOGGCnCTG 6660 

ATCTGCGMG AfiOCCA£XJTC CGGACTQGAC TCATTTACOG CCCACAGCGT CGTCCAGGTG 6720 

CTGAAGAAGC TGTCQCAGAA GGQCIAAGACC GTCATCCTGA CCATTCATCA GGGGTCnOC 6780 

GAGCTGTTTG AQCTCTTTGA CAAGATCCTT CTGATGGCGG AQQQCAGGGT AGCTTTCTT6 6840 

GGCAGTCOCA GCGAAGCCGT CGACTTCnT TCCTAGT6AG TTCGATGTGT TTATTAAfflG 6900 

TATCTAQCAT TACATTAGAT CTCAACTCGT ATCGAGCGTG GGTGOCCAGT GTCCTACCAA 6960 

CTACAATCOG QCGGACTTTT ACGTACAGGT GnGGCCGTT GTQCCCGGAC GGGAGATCGA 7020 

GTCCOGTGAT CGGATCGCCA AGATATGCGA CAATTTTGCT ATTAGCAAAG TAGCCCGGGA 7080 

TATGGAGCAG TTGTTGGCCA CCAAAAATTT GGAGAAGCCA CTQGAGCAGC GQGA6AATGG 7140 

GTACAGCTAC AAGQCCACCT GGTTGATGCA GnCCGQGOG GTCCTGTGQC GATCCTGGCT 7200 

GTCGGTQCTC AAGGAAtXJAC TCCTGGTAAA AGTGCGACTT ATTCAGACAA CGGTGAGTGG 7260 

TTCCAGTGGA AACAAAT6AT ATAAOQCnA CAATTCTTQG AAACAAATTC GCTAGATTTT 7320 

AGTTAGAATT GOCTGATTCC AGACOCTTCT TAGTTTTTTT CAATGAGATG TATAGTTTAT 7380 
AGTTTTQCAG AAAATA/VATA MTTTCATTT AACTCQCGWV CATSHGAAG ATAT6AATAT 7440

TAATGAGAT6 GGAGTAACyVT TrTAATTTGC AGATQGTTQC aVTCTTGATT GQGCTCATCT 7500 

TTTTGGGOGA fiHAfCTOASX CAAGTGGGCG TGATGAATAT CAAOGGAGOC ATCTTCCTCT 7560 

TCCT6ACCAA CATGAOCTTT CAAAACGTCT TTGCCACGAT AAATGTAAGT CTTGTrTAGA 7620 

ATACATTTGC ATATTAATAA TTTACTAACT TTCTAATGAA TCGATTCGAT TTAGGTGTTC 7680 

ACCTCAGAGC TGCCAOrTTT TATGAQGGAG GCCOGAAGTC GACTTTATGG CTGTGAC^CA 7740 

TACTTTCTQG GCAAAACGAT TGOCGAATTA OOQCmTTC TCACAGTQCO ACTGGTCTTC 7800 

ACGGCGATT6 CCTATCGGAT GATCGGACTG CGQQCOGGAG TGCTGCACTT CTTCAACTGC 7860 

CTGGOGCTGG TCACTCTGGT GGCCAATGTG TCAACGTGCT TCQGATATCT AATATCCTGC 7920 

GCX^AGCTOCT C6ACCTC6AT GGCQCTGTCT GTGG6TCCGC GQGTTATCAT ADCATTOCTG 7980 

CTCTTTQGOG QCTTCTTCTT 6AACTGQQGC TOQGTQOCAG TATACCTCAA ATOGlTGrOG 8040 

TAGCTCTCAT GGTTCOGTTA CGCCAACGAG GGTCTQCTGA TTAACGAATG GGOGGAOGTG 8100 

GAGCOGGGOG AAATTAGCTG 0ACATCGTC6 AACAGCACGT GCCOOAGirTC GGGCAAG6TC 8160

ATCCTGGAGA CGCTTAACTT GTGCGCCGCC GATCTGCCGC TGGACTAGGT OGGTCTGGCC 8220 

ATTCTCAT06 TGAQCTTCOG QGTGCTCQCA TATCTQGCTC TAAGAGTTGG QGCCCGACQC 8280 

AAGGAGTAGA AG6TAAGTAG CGGCGGCADG TAAGGGTTAA TGTTTTCAAA AAAAAATTCG 8340 

TCCGCACACA ACCTTTCCTC TCAACMGCA AAC6TQCA{7T GAATTTAAGT GTATACTTOG 8400 

GTAAGGTTCG GCTATCGAGG GGACCACCTT ATGTTATTTC ATCATQQQCC AGACOCAOGT 8460 

AGTCCAGCGG GAGATGGGCG GCGGAGAAGT TAAGCGTCTC CAGGATGACC nGOCCGAAC 8520 

TGG33CACGT GGT6TTCGAC GATGTGCAGC TAATTTGGCC CGGCTOCACG TCCGCCCATT 8580 

GGTTAATGAG CAGAOGCTCG TTGGCGTAAC GGAACCATGA GAGGTACGAC AACCATTTGA 8640 

QGTATACTGG CACCGAQCCO GAGTTCAAGA AGAAGQOGTT TTTCCATAGG CTCCQCCCOC 8700 

GTGACGAGCA TGACAAAAAT CGAiGGCTCAA GTCAGAGGTG GCGAAACOGG AGAS3ACTAT 8760 

AAAGATACCA GGCGTTT(XX5 CCTG6AAGCT COCTCGTGCG CTCT0CT6TT CCGACCCTGC 8820 

CeCTTACCGG ATACCTGTCC GCXJITTCTGC CTTCGGGAAG OGTGGCSJTT TCTCAATGCT 8880 

CAGGCTGTAG 6TATCTCAGT TGGGTGTAGG TCGnCGCTC CAAQCTGGGC TGTGTGCAC6 8940 

AAOCCCOCGT TCAGCCCGAC CQCTGOGCOT TATOCQGTAA CTATCSTGTT 6AGTGGAACC 9000 

CQGTAAGACA 06ACTTATCG CCACTQGCAG CAGCCACTOJ TAACAQGATT AGCAGAQ06A 9060 

GGTATGTAQG CGGTQCTACA GAGTTCTTGA AGTGGTGQCC TAACTACGGC TACACTAGAA 9120 
GGACAGTATT TGGTATCTGC GCTCTQCTGA AGCXy^GnAC CTTCGGAAAA AGAGTTQGTA 9180

GCTCTTGATC CGGCAAACAA ACCACCQCTG GTAQCQGTGG 1 1 1 1 1 1 IGTT TQCAAGCAGG 9240 

AGATTADQCG CAGAAAAAAA GGATCTCAAG AAGATCCTTT GATCTTTTCT ACGGGGTCTG 9300 

AGGCTCAGTG GAACGAAAAC TCACGTTAAG GGATTTTGGr CATGAGATTA TCAAAAAGGA 9360 

TCTTCACCTA GATOCTTTTA AATTAAAAAT GAAGmTAA ATCAATCTAA AGTATATATG 9420 

AGTAAACTTG 6TCTGACAGT TA(XIAATQCT TAATCAGTGA GGCACXJTATG TCAGCGATCT 9480 

6TCTATTTCG TTGATCXJATA GnGOCTGAC TCCOCGTGGT 6TAGATAACT ACGATACQGG 9540 

AGGQCTTACC ATCTQQtXXX) AGTGCTQCAA TGATA0tX3CG AGACCCAOQC TCACCGQCTC 9600 

CAGATTTATC AGCAATAAAC CAGCCAGCOG GAAGGQCOGA GCGCAGAAGT GGTCCTQC5AA 9660 

CTTTATCCGC CTCGATCCAG TCTATTAATT GTTQCOGQGA AGCTAGAGTA A6TA6TTCGC 9720 

CAGHAATAG TTTGCQCAAC GnGTTQOCA TTGCTACAQG CATCGTGGTG TCACQCTOGT 9780 

CGTTTGGTAT GGCTTCATTC AGCTGCGGTT CCGAACGATC AAGOJGAGn MJATGATCOC 9840

CGATGnTGTG CAAAAAAGCG GTTAQCTCCT TCG6TGCTCC GATCGHGrC AGAAGTTAAGT 9900

TGQCOQCAGT GTTATCACTC ATG6TTATGG CAGCACTQCA TAATTCTCTT ACTGTCATGC 9960 

CATGCGTAAG ATGCTTTTCT GTGACTGGTG AGTACTCAAC CAAGTCATTC TGAGAATA6T 10020 

GTATGCGGCG AGGGAGTTQC TCTTQCCOQG C6TCAACAGG OSATAATACC GCGCCACATA 10080 

GCAGAACTTT AAAAGTQCTC ATCATTGGAA AACGTTCTTC GGGGOGAAAA CTCTCAAfiGA 10140 

TCTTAOCQCT GTTGAGATGC AGTTCGATGT AAOCCACTCG TGCACCCAAC TGATCTTCAG 10200 

CATCTTTTAG TTTCAOGAGO GTTTCTGGGT GAGCAAAAAC AGGAAQGCAA AATGCOGCAA 10260 

AAAAfiGGAAT AAGGGC6ACA CGGAAATGTT 6AATACTCAT ACTCTTCXJTT TTTCAATATT 10320 

ATTGAAGCAT TTATCAGGGT TATTGTCTCA TGAGtX3GATA CATATTTGAA TGTATTTAGA 10380 

AAAATAAACA AATAGQQGTT CCGCGCACAT TTCCCCGAAA AGTGCCAOCT GAOGTCTAAG 10440 

AAA(»ATTAT TATCATGACA TTAACCTATA AAAATAGGCG TATCACGAGG CCGTTTOGTC 10500 

TCGCGCGTTT CGGT6ATGAC GGTGAAAACC TCTGACACAT OlAGCTCCGG GAGACGGTCA 10560 

CAGCTTGTOT GTAAGCGGAT GCCGQGAGCA GACAAQCCCG TCAGGGOQDG TCAQCGGGTG 10620 

nOQCGGGrG TCGGGGOTQG CTTAACTATG CGGCATCAGA GCAGATTGTA CTGAGAGTGG 10680 

ACCATATGCG GTGTGAAATA CCGCA(XX5AA TCGCQCGGAA OTAACGACAG TGQCTCGAA6 10740 

GTCGTGGAAC AAAAGGTGAA TGTGnGCGG AGAQCGQGirG GGAGACAQCG AAAGAGCAAC 10800 

TAOGAAACGT GGTGTGGTGG AGGTGAATTA TGAAGAGQQC GGGCGATTTG AAAAGTAT6T 10860 



aiiE#5|I 11-3048126 



10—141952 



ATATAAAAAA TATATCCCGG TGTTTTATCT AGCGATAAAC GAGTTTTTGA TGTAAGGTAT 10920 

QCAQGTGTGT AAGTCTTTTG GTTAQAAGAC AAATGGftAAG TCTACTTGTG QQGATGTTCG 10980 

AAQQQGAAAT ACTTGTATTC TATAQGTCAT ATCTTGTTTT TATTQQCAOA AATATAATTA 11040 

CATTAGCTTT TTGAGGQGGC AATAAACAGT AAAGAC6ATG GTAATAATOS TAAAAAAAAA 11100 

AACAAGCAGT TATTTOSGAT ATATGTGGGC TACTCCTTO; GTGGGGCGCG AAGTCTTAGA 11160 

GGGAGATATG GGAGGAGCCG GAM3GTGAG6 ATGAGAATGG CCAGM; 11206 

Brief Description of Drawings 

Figure 1 shows the schematic map of the vector of this
invention, pTrap-hsneo.

Figure 2 shows the schematic map of the vector of this
invention, pTrap-G4-p53.

Figure 3 shows the schematic map of the vector of this
invention, pCasperhs-G4-LT.

Figure 4 shows the schematic map of the vector of this
invention, pTrap-G4-luc.

Figure 5 shows the schematic drawing of a fly genome into
which the vector of this invention is inserted for cloning.

Figure 6 shows the results of sequencing RT-PCR products
of aop-Gal4 and m-white-aop fusion mRNAs.

Figure 7 presents pictures of characteristic
beta-galactosidase staining patterns in different parts of the fly
brain resulting from crossing positive gene trap lines with
flies harboring a UAS-lacZ construct.
Gene-Trap vector construct

mRNAs supposed to be expressed:

1.) Exon 1 - GAL4

2.) m-white (TGA stop) - Exon 2

Proteins supposed to be expressed:

1.) Gal4 fused to preceding (upstream or N-terminal) exons.

2.) m-white being expressed itself.

ABBREVIATIONS:

3'P, 5'P  flanking P-element sequences

A.S.  consensus splicing acceptor site

D.S.  consensus splicing donor site

f.s.  frame-shift signals for directing ribosomes translating the proper
reading frame of Gal4 mRNA (stop/start signal)

hs70 transcription terminator sequence and poly-A addition site

Neo  neomycin resistance sequence for primary selection of transformants

GAL4  promoterless Gal4 gene

m-white  mini-white gene with its poly-A addition site removed
Abstract

The objects of this patent application are to provide a
new vector system to facilitate the cloning and functional
analysis of new genes of a fly, Drosophila melanogaster, and a
method for gene trapping with the vector system.

The present application provides a vector for trapping an
unknown gene of Drosophila melanogaster, which is a recombinant
plasmid comprising the following nucleotide sequences in this
order: an artificial consensus splicing acceptor site; a
synthetic "stop/start" sequence; a reporter gene; a drug
resistance gene; a gene responsible for a detectable phenotype
of the Drosophila melanogaster; and a synthetic splicing donor
site. The present application also provides a method for
trapping an unknown gene of a Drosophila melanogaster by using
the vector.
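The element order stated in the abstract can be written down as a small data structure. The following is only an illustrative sketch: the element names (`splice_acceptor`, `hsp70_terminator`, and so on) and the helper `follows_claimed_order` are hypothetical labels chosen for this example and do not appear in the application.

```python
# Element order as stated in the abstract. The identifiers are hypothetical
# labels for this sketch, not names used in the application.
CLAIMED_ORDER = (
    "splice_acceptor",
    "stop_start",
    "reporter",
    "drug_resistance",
    "phenotype_marker",
    "splice_donor",
)


def follows_claimed_order(elements):
    """Return True if the claimed elements occur exactly once each, in order.

    Elements that are not part of the claimed list (for example terminators
    or P-element ends) may be interspersed and are ignored.
    """
    core = [name for name in elements if name in CLAIMED_ORDER]
    return tuple(core) == CLAIMED_ORDER


# A hypothetical construct layout with extra, unclaimed elements around the core.
example_layout = [
    "3P_end",
    "splice_acceptor",
    "stop_start",
    "reporter",
    "hsp70_terminator",
    "drug_resistance",
    "phenotype_marker",
    "splice_donor",
    "5P_end",
]

print(follows_claimed_order(example_layout))
```

A layout in which, say, the drug resistance gene precedes the reporter gene would be rejected by this check.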

Representative Drawing: Figure 1
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j^^T^5] Gal4 DNA*g-a'fI^^- P 5 3ill-^» 

[»^:^8] ^^RIK'S^'SM^. :t^:rv>r $/>-3^-^X;^;^^>>?>>'3l^- 
[M^^9] pCasperhs <D4<U ^^n- — t - h a ^ 
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□ ^ - - L ;t Gal 4 f^mtm^-^ - i^^T mmm^m&iF- ^m^^pcsis 

perhs ^^K^^-^ 

(b) mMWiix:^^ irkmm^^mi^^mmL, 

"TVZf (e) ^CfeV^TGal4©|§^S:^l!J^■r■^.»^:5l O^fettl 1 (Olsm. 
-iz'gfe-^it^e^-Z:-^ X-T^^yy (e) (Cfe V^T 2 7j?C?^®^^^S:U A S 

7aL^-if-^;t^^tx@B-^-ric-®g6-a^aed=-©i§^s:M^-r-5if*^i o t 
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5] ^MW-ffijt-e^-J^J'^^^'^-f ^>>-*^*^^>^^^^- 
b) T'G4 1 8ii&'ffi?^«^m#:^^^t-<5M#^l 0-&Vxbl 5 © V^-rtL*^(^):^ 

$:Z:®JlR;*T*-r-5ffiil^#^::^^X5 K-efe'S^^'^-A, ^ J; OfpCasperhs © 
tKU ^D-->^*Sl5fel^lCli- hS^a ^>^D=e-^-^l^-^b:/^Gal4?Sffi^l: 

(a) eBa^'fe^■S:^*fe'^^^BlIM©>'^3l(Z)>^fy AlC/<^ ^-A^<^:^J^^^7 3f- 
1^ ^ - B ® 1 

(c) ^^^-AcDi xmmM^^ ^ w^mm^m^m t vx^^^i-^ 

(d) m<D'^f)'^m^^/^:r.^U^-t ^ ^ tlZ ^ ^ - A(D 2 

(e) c:© 2r^?^K^^^5:^^^-B(D l?j?C?^M^^^i:^iHbTK^^- 



3 

A i: K ^ tJf - B ® M # $: -5 A :ii $: ;g ai § i± . 

(f) :^^v:f (e) T'#feAai$:UA s -;i/i/7i^--fe'-^;t^^i:xiBL 

7] ^^^7 ^f-A;6'JpCasper3{Cffi5f5-t-g>y^>^^ K-^&S»^:^ 

1 Q<Dl3m. 

b) -eG4 1 8Wtt?^a^g|#:$:^^-r§0^:^l 6 5&V^L1 8 CD V^Ttl*^©^ 

[0 0 0 1 3 

3(Z)|§^«, Ui/^^Vn'O (Kd V:7>f ^ • Jij'yt.'r}]^ : Dros 
ophilamelamogaster) ©tf^iie^©^ □ - - >^i3 J:tJf*i|gi>|lf 

[0 0 0 2] 

[fi£5f5©g^ffi:^(DmS] 

MMt)^B:0'<^^'t^ (Gossler et al. Science, 244:463-465, 1989 ) „ 
^— (Wilson et al, , Genes & Development, 3:1301-1313, 1989 ) CDJC&M. M 

G a 1 4 / U A S T ^ 5^ ^ ^ - ^ T T ^ if -f ^ {C Mi- -5 
htr=: lti&(D P/K- ^ -3t^-?-CD#g3^/t ^ - 1 SJe^±i^a^^{cMt- 

[0 0 0 3] 

«-r-g)^i:T'$>-5„ cfc»;pb<li, jA<^fl!StiT v^-g>x>7^>-y- 

® ^ -g> ^ 5: *l ^ f -5 ^ t T' ^ S = 
[0 0 0 4] 



[0 0 0 5] 

^ 1 Ix^jf- ^ Gal 4 it^^. Gal 4 D N 

[0 0 0 6] 

^>y.'7 3L=7 — M^ABi^}l L/, ^(D:/n^ — ^-$rti— xy ^^ncE—^- 

[0 0 0 7] 

-^^x y ^ -r > ^^-^sp^ 

(b) ^Mtftt-r^-S 1 r>;|^M«il#:$:^#^L. 






(e) 2 U A S (Ji^jStStt'fbSB^lI) -;i/i/-7ai^--^^^^^ 
[0 0 0 8] 

l/yJf-^-Jt'e^i: bT®Gal4 DN A^-^MJ^- P 5 3 ®l-^st^^> 
$:Z:<D]lH;*"e^-re7ia^^#:>''^>^^ KT'^'5'^^^^-A, S J; tJtpCasperhs O 

^ - B CD 1 



A h ^ ^ ^ - B CD $: ;t t" <5 A a: S m $ -Br, 

(f ) (e) T'#feAx$:UAS-;i/i:/7ai^--fe'-^^^^i:3SieL 

[0 0 0 9] 
[0010] 

^ — T^^^pCasperS (Pirotta, Vectors: A survey of ralecular cloni 
ng vectors and their uses, eds. Rodriguez, R.L. & Denhardt, D.T., Butter 
worths, Boston. 437-456, 1998 ) ^ J: tJ^^'g!^ Gal 4 - U A S (Brand 
and Perrimon, Development, 118:401-415, 1993) lCSt5 V^T^ili"'5 Z. A'^T 

[0011] 

■t"J&t>t). yp^-^— C^>^^^Gal4^'g^3^;^pCasper3(D/j<l) ^n-:=.>^*gp 
[0 0 12] 

> — — if (hs— neo) ^-fe^^r^A b ito 
[0 0 13] 

0 1 > h ^ (pTrap-hsne) (Z)l5EliS:%0T^ *J , gB3?!I«-^ 1 It 

h v^y ^#^#:T'$>^pTrap-G4-p53 (0 2) :/^X^FpTrap- 

hsne<Z)Gal4 n- K-ffcSB^lI?: Gal 4 D N A^-^M^- p 5 3 gfe-^it'^^ (Clonte 
ch^tjg. Matchmaker Two Hybrid System, #K1605-1) TS^SI;! -5 Zl il tC j: U 

yffibfec 3©^^^*^, t:- ^y ^::^n^-^f-$:M-&L7^Gal4?gttfb 

m^-^-PTmm^^m-t^M<DK^ ^-pC^sperhs-GA-U (03) 

X(Z)>5f 7 AtfJlC^^-r-Si:, migffiGal4:9'^®T-fe>:/y-tj:, p5 3-^- 

[0014] 

>T©Gal4 V^*^^f|Sll- h 3 ^y <j: U fti^cO^pg-C^^-T § 

r *^nriB i: ^ o 

Gal4|§3^(D^m$:^^}C-t-5:t«)JC. S[I®^^#iCfeV>Tl*, GaU^t-fe^^ 
S:Gal4 -t5Jt'5;i/S/:7 3i^--^®!{-^5t'e^-Cfi^^;tTpTrap-G4-luc (04) 

[0015] 

m (Brandes et al., Neuron, 16:687-694, 1996) , 'f@^®^^fe>'^x^Cfe^tS 
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(1) -x^^y--^^ : 

C#At--5 3^*^T'^'5c 1 ^kl^%WkWo:>W!ri\%. h s - n e o^^^lCi: U 
[0 0 16] 

[0 0 1 7] 

mmW^^M-t^ ^^i:3?iBi-er ilfCd; t)^^$tl-g>o (pTrap- 

=mBift*^ B(Z)fei:Gal4|g^(Z)^iCB^nS : S <Dfe7b'5^V^^^® 9 0 %JS^_h 
i:*'^IE^$tlTV^^ (Brandes et al.. Neuron, 18:687-692, 1996) „ 




(2) ^^n-->^ : 
(D-^-iJ- ^-e^ \t. AX(3!) X ^ ^ >r ^> > ^^^^^ J: tJ^#-§^^^ ^ffl L T m R 
t>tl-&o $ ^lCiE5tlC3«B^-5i:. Gal4 ©mRNAlilf A^^®JiSSlC'fe«i-S 

^<x^y> (^) tci6'^LTb^^y:^$tifeit^^-®2«^e^>fb^^^LTv% 

-5 (0 5) „ 
[0 0 1 8] 

Z:<3!)#^g!(tt, 3' ^i:tJ^5' RACE (Rapid Amplification of cDNA Ends : 

JcU^SH^J^^'^^X.t;, BDGP (Berkeley Drosophia Gemone Project ) 
EST (Expressed Sequence Tag) ^ >r U — tti:&^ ^l^V>5i^-effiH69^mR 

N A $: ^ m-r r ;?)^ RTIi ^ S o 

[0 0 19] 

mmw- t; u ffi ^ fcg^^^M r$>*f jc a am i*^^ 1 1 1 ^^^-t® 

»9i^S:i|-r-&®JcMbT. roD#§^(Z):;^a®«-^icii lii^^l^ilT'*'^^^®!^^ 

BTfg ii ^ -6 =fe § o 

3g&^^PCR%b<«vectorettePCRlCj:oT, ^feli 
a^r^lCj: U5WM-t-g>r i:;6^T'#-5>o Z1(D®-^%B DG P^-f U -^r^^UT 






(3) l':7s^=L- : 

^ tifeg^^^M®^JiM*^lll^lC P o: ^ > h A®^:^® 3 o }t 3 i: ?: 

^61;^ J: A - > "x?*^^ ^ L/ < s 3 tc cfc y $ n -5 c 

[0 0 2 0] 
[0 0 2 1] 

$:=fe);t-f^ji®il6-a-mRN A$:^;tt--5i:v^e)aaA^e), h^^y :;^$tlfc51'g^• 
-r-ss'jOAoihr^/Nai^^rxgBt- -acinic cfc g^^^Mcz)^3^§?S: i/x^^. 

- ^ z: i: 5: «rig -r -5 o 

[0 0 2 2] 

(4) h^v:f^nr=.m^'?'(D^mmi5^xj^m^m^m^^^-y(Dm^ : 






(5) ^if>f ^^5^«f : 



Mm 



[0 0 2 3] 

^ «t> ICigl^il ^ ti -6 *^ V -9 ^ S *J ^ 

7!j^ B8 # f -5 S& ^ 1^ ^ "t* ^ ^ b ^ V ^ o 

[0 0 2 4] 

[^JfeM] 



6li. aop-Gal4 J3J;t^m-white-aop ife^m R N A© R T - P C R^«^(D@e 



[0 0 2 5] 

^Sli, ^^©a o p (anterior open/pokkur i/yan ) %^^MB^^0)9.^<O>{ 
[0 0 2 6] 

7-e«, lais^'e?- h ^ ^ U A S - l a c Z#||#:$:'^*-rSAX 



[0 0 2 7] 
[0 0 2 8] 

[Sequence Listing]

SEQ ID No.: 1

Sequence length: 11206 base pairs

Strandedness: double-stranded

Molecule type: DNA

0001-0237: 3' P sequence

0238-0274: synthetic splicing acceptor site and stop/start sequence

0275-3164: Gal4 gene (coding region and 3' UTR)

3165-3426: hsp70 terminator

3427-3457: synthetic DNA

3458-4907: neomycin phosphotransferase gene and heat shock promoter

4908-8275: mini-white gene

8276-8299: synthetic splicing donor site

8300-8446: 5' P sequence

8447-11206: pCasper3 vector sequence containing the pUC8 sequence
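The coordinate ranges in the feature list above can be checked for internal consistency. The sketch below is an illustration only, not part of the application: it takes the ten ranges as printed, together with the stated total length of 11206, and tests whether they cover the sequence without gaps or overlaps. The names `FEATURES`, `check_tiling` and `feature_lengths` are hypothetical helpers introduced for this example; the feature descriptions are left out because only the coordinates are needed.

```python
# Feature coordinates as printed in the listing (1-based, inclusive).
FEATURES = [
    (1, 237),
    (238, 274),
    (275, 3164),
    (3165, 3426),
    (3427, 3457),
    (3458, 4907),
    (4908, 8275),
    (8276, 8299),
    (8300, 8446),
    (8447, 11206),
]
TOTAL_LENGTH = 11206


def check_tiling(features, total_length):
    """Return a list of problems; an empty list means the ranges tile 1..total_length."""
    problems = []
    expected_start = 1
    for start, end in features:
        if end < start:
            problems.append(f"range {start}-{end} is reversed")
        if start != expected_start:
            problems.append(f"expected a range starting at {expected_start}, found {start}")
        expected_start = end + 1
    if expected_start != total_length + 1:
        problems.append(f"ranges end at {expected_start - 1}, not at {total_length}")
    return problems


def feature_lengths(features):
    """Length in bases of each range."""
    return [end - start + 1 for start, end in features]


print(check_tiling(FEATURES, TOTAL_LENGTH))
print(sum(feature_lengths(FEATURES)))
```

The same check flags a reversed range such as 8276-8229, which would indicate a misprinted end coordinate.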



0 2 3 8 


-0 2 74 : 












3 4 2 7 


-3 4 5 7 : 


-^^DN A 










4 9 0 8 


-4914 : 


-a-^DNA 










8 2 7 6 


-8 2 2 9 : 






• * 






le^fj : 














CATGATGAAA TAACATAAGG TGGTCCCGTC GGCAAGAGAC ATCCACTTAA CGTATGCTTG 60

CAATAAGTGC GAGTGAAAGG AATAGTATTC TGAGTGTCGT ATTGAGTCTG AGTGAGACAG 120

CGATATGATT GTTGATTAAC CCTTAGCATG TCCGTGGGGT TTGAATTAAC TCATAATATT 180

AATTAGACGA AATTATTTTT AAAGTTTTAT TTTTAATAAT TTGCGAGTAC GCAAAGCTCT 240

TTCTCTTACA GGTCGAATTG ATGTGATGGA TCCAATGAAG CTACTGTCTT CTATGGAACA 300

AGCATGCGAT ATTTGCCGAC TTAAAAAGCT CAAGTGCTCC AAAGAAAAAC GGAAGTGCGC 360

CAAGTGTCTG AAGAACAACT GGGAGTGTCG CTACTCTCCC AAAACCAAAA GGTCTCCGCT 420

GACTAGGGCA CATCTGACAG AAGTGGAATC AAGGCTAGAA AGACTGGAAC AGGTATTTCT 480

ACTGATTTTT CCTCGAGAAG ACCTTGACAT GATTTTGAAA ATGGATTCTT TACAGGATAT 540

AAAAGCATTG TTAACAGGAT TATTTGTACA AGATAATGTG AATAAAGATG CCGTCACAGA 600

TAGATTGGCT TCAGTGGAGA CTGATATGCC TCTAACATTG AGACAGCATA GAATAAGTGC 660

GACATCATCA TCGGAAGAGA GTAGTAACAA AGGTCAAAGA CAGTTGACTG TATCGATTGA 720

CTCGGCAGCT CATCATGATA ACTCCACAAT TCCGTTGGAT TTTATGCCCA GGGATGGTCT 780

TCATGGATTT GATTGGTCTG AAGAGGATGA GATGTGGGAT GGCTTGCCCT TCCTGAAAAC 840

GGACCCCAAC AATAATGGGT TCTTTGGCGA GGGTTCTCTC TTATGTATTC TTCGATCTAT 900

TGGGTTTAAA CGGGAAAATT ACACGAACTG TAACGTTAAC AGGCTCCCGA CCATGATTAG 960

GGATAGATAC ACGTTGGCTT CTAGATCCAG AACATCCCGT TTACnCAAA GTTATGTCAA 1020

TAATTTTCAC CCCTACTGCC CTATGGTGCA CTCACCGACG CTAATGATGT TGTATAATAA 1080

CCAGATTGAA ATCGCGTCGA AGGATCAATG GCAAATCCTT TTTAACTGCA TATTAGCGAT 1140

TGGAGCCTGG TGTATAGAGG GGGAATCTAC TGATATAGAT GTTTTTTACT ATCAAAATGG 1200

TAAATCTCAT TTGACGAGCA AGGTCTTCGA GTCAGGTTCC ATAATTTTGG TGACAGCCCT 1260



ACATCTTCTG TCGCGATATA CACAGTGGAG GCAGAAAACA AATACTAGCT ATAATTTTGA 1320 

CAGCTTTTCC ATAAGAATGG CCATATCATT GGGCTTGAAT AGGGACCTCC CCTCGTCCTT 1380 

CAGTGATAGC AGCATTCTGG AACAAAGACG CCGAATTTGG TGGTCTGTCT ACTCTTGGGA 1440 

GATCCAATTG TGCCTGCTTT ATGGTCGATC CATCCAGCTT TCTCAGAATA CAATCTCCTT 1500 

CCCTTCTTCT GTCGACGATG TGCAGCGTAC CACAACAGGT CCCACCATAT ATCATGGCAT 1560 

CATTGAAACA GCAAGGCTCT TACAAGTTTT CACAAAAATG TATGAACTAG ACAAAACAGT 1620 

AACTGCAGAA AAAAGTCCTA TATGTGCAAA AAAATGCTTG ATGATTTGTA ATGAGATTGA 1680 

GGAGGTTTCG AGACAGGCAC CAAAGTTTTT ACAAATGGAT ATTTCCACCA CCGCTCTAAC 1740 

CAATTTGTTG AAGGAACACC CTTGGCTATC CTTTACAAGA TTCGAACTGA AGTGGAAACA 1800 

GTTGTCTCTT ATCATTTATG TATTAAGAGA TTTTTTCACT AATTTTACCC AGAAAAAGTC 1860 

ACAACTAGAA CAGGATCAAA ATGATCATCA AAGTTATGAA GTTAAACGAT GCTCCATCAT 1920 

GTTAAGGGAT GCAGCACAAA GAACTGTTAT GTCTGTAAGT AGGTATATGG ACAATCATAA 1980 

TGTCACCCCA TATTTTGCCT GGAATTGTTC TTATTACTTG TTCAATGCAG TCCTAGTACC 2040 

CATAAAGACT CTACTCTCAA ACTCAAAATC GAATGCTGAG AATAACGAGA CCGCACAATT 2100 

ATTACAAGAA ATTAACACTG TTCTGATGCT ATTAAAAAAA CTGGCCACTT TTAAAATCCA 2160 

GACTTGTGAA AAATACATTC AAGTACTGGA AGAGGTATGT GCGCCGTTTC TGTTATCACA 2220 

GTGTGCAATC CCATTACCGC ATATCAGTTA TAACAATAGT AATGGTAGCG CCATTAAAAA 2280 

TATTGTCGGT TCTGCAACTA TCGGCCAATA CCCTACTCTT CCGGAGGAAA ATGTCAAGAA 2340 

TATCAGTGTT AAATATGTTT CTCCTGGCTC AGTAGGGCCT TCACCTGTGC CATTGAAATG 2400 

AGGAGGAAGT TTCAGTGATC TAGTCAAGCT GTTATCTAAC CGTCCACCCT CTCGTAACTC 2460 

TCCAGTGACA ATACCAAGAA GCACACCTTC GCATCGCTCA GTCACGCCTT TTCTAGGGCA 2520 

ACAGCAACAG CTGCAATCAT TAGTGCCACT GACCCCGTCT GCTTTGTTTG GTGGCGCCAA 2580 

TTTTAATCAA AGTGGGAATA TTGCTGATAG CTCATTGTCC TTCACTTTCA CTAACAGTAG 2640 

CAACGGTCCG AACCTGATAA CAACTCAAAG AAATTCTCAA GCGCTTTCAC AACCAATTGC 2700 

GTCCTCTAAC GTTCATGATA ACTTCATGAA TAATGAAATC ACGGCTAGTA AAATTGATGA 2760 

TGGTAATAAT TCAAAACCAC TGTCACCTGG TTGGACGGAC CAAACTGCGT ATAACGCGTT 2820 

TGGAATCACT ACAGGGATGT TTAATACCAC TACAATGGAT GATGTATATA ACTATCTATT 2880 

CGATGATGAA GATACCCCAC CAAACCGAAA AAAAGAGTAA AATGAATCGT AGATACTGAA 2940 

AAACCGCGCA AGTTCACTTC AACTGTGCAT CGTGCACCAT CTCAATTTCT TTCATTTATA 3000 
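Because each full row of the listing carries six groups of ten bases followed by a running end position, rows can be screened mechanically for transcription damage. The following is a minimal sketch under that assumption, not a tool described in the application; `parse_row` and `check_rows` are hypothetical helper names. It reports rows whose base count disagrees with the printed position or that contain characters other than A, C, G and T. A row whose printed position is itself unreadable would raise a `ValueError` in `parse_row` and needs manual inspection.

```python
VALID_BASES = set("ACGT")


def parse_row(line):
    """Split one listing row into (sequence, printed_end_position)."""
    *groups, position = line.split()
    return "".join(groups), int(position)


def check_rows(lines, first_position=1):
    """Return (row_number, message) pairs for rows that look damaged."""
    problems = []
    position = first_position - 1
    for row_number, line in enumerate(lines, start=1):
        sequence, printed_end = parse_row(line)
        unexpected = sorted(set(sequence.upper()) - VALID_BASES)
        if unexpected:
            problems.append((row_number, "unexpected characters: " + "".join(unexpected)))
        position += len(sequence)
        if position != printed_end:
            problems.append((row_number, f"counted {position}, printed {printed_end}"))
            position = printed_end  # resynchronise on the printed number
    return problems


# Two rows copied from the listing above (positions 1261-1380).
rows = [
    "ACATCTTCTG TCGCGATATA CACAGTGGAG GCAGAAAACA AATACTAGCT ATAATTTTGA 1320",
    "CAGCTTTTCC ATAAGAATGG CCATATCATT GGGCTTGAAT AGGGACCTCC CCTCGTCCTT 1380",
]
print(check_rows(rows, first_position=1261))  # prints []
```

A clean result only shows that the row lengths and the alphabet are consistent; it cannot detect a base that was misread as another valid base.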

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CAlObllllu 


PPTTPTTTT A 


THTA ATTAT A 
lul nAUi Ai A 


PTPPTPTA AP 


TTTf.A ATfTT 


nnCCATGTAA 

VJvj wWXi X \J JL XIXX 


3060 


CUlbluAlUl 


A T A P A A TTTT 
Ai AbAAl i i 1 


TTA A ATHAPT 


AHA ATTA ATP 


rrr ATf.TTTT 


TTTTGGACCT 


3120 


AAA TT'^T'T'/^ A 

AAAllUllLA 


TP A A AATATA 




TTATTPAHA A 


nPTTATHHAT 

VJw X X a 1 OVJ A X 


ACCGTCGACT 


3180 


AAA f^Cr* A A A T 

AAAGCCAAAl 


AP A A ATT ATT 
AbAAAi lAi 1 


p APTTPTrrp 

bAul 1 Uluub 


TTA A PTTTTT 
X X AAu X i X X X 


AA A APtTHATA 

A A A AO X U A X A 


TTATTTATTT 

xXAi. XXAX XX 


3240 


GGTTGTAAUU 


A A PP A A A AP A 

AAUUAAAAbA 


ATPTA A ATA A 
Alul AAAIAA 


PTA ATAP ATA 
UX AAX AL»AX A 


ATTATPTTAP 

AX XAXUX X AVJ 


TTTTAAGTTA 

X X JL X XI AVJ X X A 


3300 


r*r^ A A A A A TTT 

GCAACAAAll 


P A TTTT A PPT 
bAl 1 11 Abb 1 


ATATT APPT A 
A i Al 1 Aubl A 


PTTPPTTA AT 
bX XuUX X AAX 


AA ATAPAATA 

AAA X AUA A X A 


TATTTATTTA 

XAXX XAXXXA 


33B0 


A * O A f A A 'I'T'/^ 

AAGATAATTC 


PTTTTT A TTP 

bi 1 1 1 1 Al lb 


TP APPP APTP 
IbAuubAblu 


APTTTPPTT A 
AbX X XbbX X A 


A A A APTPHTT 

AAAAL/XOVjX X 


TAnATGCACT 

X AVJ A X V^V^ AO X 


3420 


AGAAGGACCG 


bbbblOblbb 


kcrrr atpp a 

AbbbbAlUbA 


A Kcr hcrrrr 

AAbbAbubbb 


A APA APTPP A 
AAu AAU 1 \jLf& 


PPATHAPATn 

VJU/ A X VJAVJ A X \J 


3480 

tJ*±tJ V/ 


CCCGCGCTGG 


A PP A TP A TPP 

AGGAlUAlbb 


Abbbbbbb 1 b 


rrrch a a app 

bbbbAAAAbb 


ATTPPPA APP 
AX XUOuAAulj 


PP A A rPTTTH 

V-/V^AAOV>X X XV^ 




ATAGAAGGCG 


bbbblbbAAl 


PP A A ATPTPP 
bbAAAlblbb 


TP A TPPP APP 
1 bAl bbb Abb 


TTPPPPPTPP 
X X uVjUUVj X 


PTTPCTPHPT 

Vy X X VJVJ X wVJVJ X 


OvJv V/ 


CATTTCGAAC 


r^r^r* a p a ptpp 

CCCAbAblbU 


ppprpp A p A A P 

bbblbAbAAb 


A APTPPTP A A 
AAblbblbAA 


P A APPPPATA 
Vj AAuuLrU A X A 


PA APCPPATn 

VjAAUVJV>VJ A X VJ 




CGCTGCGAAT 


CGGbAbbbbb 


P AT APPPT A A 

bAl Abbbl AA 


APP hrc kCC A 
AbbAbbAbbA 


APPPPTP APP 
AbUuu X UAuo 


PP ATTPPPPn 

wv/AX IVvVJV^V-^VJ 


^7?0 


CCAAGCTCTT 


P A PP A A T A TP 

CAGuAAlAlb 


APPPPT APPP 
AbbbblAbbb 


A APPPTATPT 
AAbbbl A Ibl 


PPTP ATAnPP 
ULf X u A X AuUU 


PTPPnPPAHA 

Vj X 1_/VjVjV-^V^ AVy A 


^780 


CCCAGCCGGC 


r* A p A r"vr*r* a t 
CACAblbbAi 


P A ATPP A P A A 

bAAlbbAbAA 


A kcrcccc A T 

AAbbbbbbA 1 


TTTPP APP AT 
X X XUoAL/UAX 


HATATTPPPP 

VJA XAXX Vy VJVJVy 


^840 

OOMV/ 


AAGCAGGCAT 


r*r^r*r* atpppt 
CGGLAlbbbl 


P APP A PP AP A 
bAbbAbbAbA 


TPPTPPPPPT 
1 bb 1 bbbbb 1 


PPPPP ATHPP 


PPPPTTPAPn 

L/VlwL/ X X VJAVJV^ 


OC V V/ 


GTGGCGAACA 


PTTPPPPTPP 

bl Ibbbblbb 


bbbbAbbbbb 


TP A TPPTPTT 
IbAlbblbl 1 


PPTPP APATP 
X Au A X 


ATPPTPATCPt 

A X \JKJ X VJA X VyVJ 




ACAAGACCGG 


PTTPP A TPPP 
b 1 1 bb A 1 bbb 


A PT A PPTPPT 
Abl Abblbbl 


PPPTPP ATPP 
bbb 1 bb A 1 uU 


P ATPTTTPHP 
VjAXuX X XUuU 


TTPPTGCTCn 

X X VJVJ X VJVJ X VjfVJ 


4020 

rx V/xj V/ 


AATGGGCAGG 


T A r*ncnr* a tp 

lAbbbbbAlb 


A APPPTATPP 
AAbbbl Albb 


trrrrrrrr a 

AbbbbbbUUA 


TTPPATP APP 

X X VjOA X L» AULf 


P.ATPATPGAT 

V^ A X Vj A X VJVJ A X 


4080 

*I\/VJ\/ 


ACTTTCTCbb 


P A PP APP A A P 

bAbbAbbAAb 


PTP AP ATP AP 
blbAbAlbAb 


APP AP ATPPT 
Auu Au A X vjKj X 


PPPPPPPPAP 


TTCGCnCAAT 

X X WVJ wwv^A A X 


4140 


AGCAGCGAbl 


bbbl Ibbbbb 


TTP APTP A P A 
1 IbAblbAbA 


APPTPPAHP A 
AUu X AvjLf A 


P APPTPPHPA 


APnAACGCCC 

A VJ VJ A A Vj V^ w 


4200 


GTCGTGbUUA 


PPP A PP A T A P 
bbbAbbAl Ab 


rrrrrcTcrr 

bbbbbLf 1 ulyLf 


TPPTPPTPIP A 
X ou X Lfw X VjU A 


PTTP ATTHAP 

U X X KjA X X AVJ 


nnnACCGGAC 

VJ VJ V> A V^OVJ VJ A u 


4260 


AbblUbblUl 


TP APAAAAAP 
IbAbAAAAAb 


A hcrrrrrcr 

AAoUuuuLfUU 


PPPTPPPPTP 


APAPnpnnAA 

A A \J w w O Vj A il 


CACGGCGGCA 


4320 

JLi V/ 


Tr A p A rr* Apr* 
lUAbAbUAbU 


PP ATTPTPTP 
bbAl lb lb lb 


TTnTnPPP AH 


TPATAHPPPA 


ATAnnrTCTC 


CACCCAAGCG 


4380 


bbbbbAbAAU 


PTPPPTPP A A 
bl bbulubAA 


TPPATPTTHT 


TPA ATPATHP 

X wA A X wA X \J\j 


nAAACGATCC 


TCATCCTGTC 

X V^ A X \J\J X \J X \J 


4440 


TPTTO A TP A P 

lUi IbAiUAb 


A TPPPPT A TT 
A 1 bbbb 1 A 1 1 


P AP APTTPTP 
U Au Au i 1 v-f i O 


TTP TTHT A TT 

X XOX XvJlAX X 


PAATAATTAG 

wAA X A A X 1 JX\J 


TTCTTGGCAG 

X X V/ X X VJVJ V> AVJ 


4500 


ATTTCAGTAG TTGCAGTTGA TTTACTTGGT TGCTGGTTAC TTTTAATTGA TTGACTTTAA 4560

CTTGCACTTT ACTGCAGATT GTTTAGCTTG TTCAGCTGCG CTTGTTTATT TGCTTAGGTT 4620

TCGCTTAGCG ACGTGTTGAG TTTGCTTGTT TGAATTGAAT TGTCGCTCCG TAGACGAAGC 4680

GGCTCTATTT ATACTCCGGC GCTCTTTTCG CGAAGATTCG AGGCGCGCTC TGTGGAACCA 4740
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A OVJ A \J A O O il U 


TATGCCGTTT 


ACTGTGTGAC 

AV^ X VJ 1 VJ X \3£x\J 


AGAGTGAGAG 


AGCATTAGTG 

AJIVJ ^>^aX X X AA\J X VJ 


CAGAGAGGGA 

AA VJ AA \J AA ^J ^J A^ 


4800 




AAHAAAAGAG 

AAVJ AA AAVJAVJ 


AGAATAACGA 

AVIAA X AA A 


ATAACGGCCA 


GAGAAATTTC 


TCGAGTTTTC 


4860 

A V^ V^ 


TTTPTHPfA A 


APA AATGAnr 

A wA AA X VJ AWw 


TACCAnAATA 

X AV^WAV/A A X A 


ACCAGTTTGT 

£X\J\J IWJ X X X VJ X 


TTTGGGATCT 

XXX VJ VJ VJ AX X vy X 


AGTCCCTAAT 

AXVJ X. ^htV^Am/ X AXAA X 


4920 


X U 1 AVj 1 a 1 u 1 


ATHTAAHTTA 

A X Vj X A AVJ X X A 


ATAAAAGCCT 

A X A AA AWV^V/ X 


TTTTTGGAGA 

X X X X X UUAU A 


ATGTAGATTT 

J-X X VJ X AXXJAX XXX 


AAAAAAACAT 


4980 


Allllilill 


TATTTTTTAP 

XAXXXXXXAw 


TGPAPTGGAC 

X VJV/AVy X VJVJAw 


ATPATTGAAG 

A 1 Vd/ A X X U A AVy 


TTATCTGATC 

X XAXwXVJXXX \J 


AGTTTTAAAT 

XI VJ X X X X XXXXXX X 


5040 


TTAPTTrn AT 


PP A APPHTAT 

OL/ A AVJUVJ X A X 


TTGAAPTAPr 

X X VJA AVJ X AVy w 


AGGTTGTTTG 

AVJVJ X X Vy X X X U 


GATTACCTCT 

U A X XAVd/VyXVX 


CACTCAAAAT 

\J1WJ X V/XXXxXXXX X 


5100 

XJ X \I V 


r AT ATTPP AT 


TP A A APTPAP 

X oAAAu X oAu 


PPPTPTTTPP 
ovjv> X Vj X X X 


PTPPTTPTPT 

\j 1 VvO X X V^ X w X 


GTPPACAGAA 

U X V/V^AWAU A A 


ATATCGCCGT 

A X A X V^UV/V^U X 


5160 

O X w v 


pTPTTTrrrr 


PPTPPPTPPP 


PTATPTPTTT 

V_;XAXOXV^XX X 


PGPPAPPGTT 

V^VJwVyAOvjU X X 


TGTAGCGTTA 

X VJ X AUVyU X X A 


CCTAGCGTGA 

\J\J X XXVJ V/\J X \JXX 


5220 


hTCTrrrrcT 

A 1 u 1 LrULiL/vy i 


TP APTTPPAP 
XUAvjX XVjVjAO 


TTTPTPAPPP 

X X X Vj 1 V^AVjUVj 


PTTTPPTGAP 

VJl X XwUXVJAV-/ 


PA AGGTPPAA 

U A AUV^ X V/Vj AA 


GGGGTTTACG 

UwUU XXX AV/U 


5280 


Cr hTCh ATT A 
OuE i UaA 1 1 a 


A AP AP A A APT 

A AU AL/AA Au X 


PPTPTPPPA A 

KjKj X Vj 1 VjOVjA A 


A APTPPTPTP 


GPTTPTTATT 

UVy XXV^XXAXX 


TTTGTTTGTT 

XXXUX XXUX X 


5340 


TTTTP A PTP A 
111 XuALiluA 


TTPPPPTPPT 
X X LjuvjVj X Vju X 


P A TTPPTTTT 
VjA X X VjvjX XXX 


PPPTPGPTA A 

VJVJVJ X uuVJ X AA 


GPAGGGGAAA 

UV^AUUUU AAA 


GTGTGAAAAA 

U X U X U A A AAA 


5400 

VJT\/ v 


Tcrrcrc k at 


rrnrr a apap 

U VI u UL/ A A VJ A Vj 


PATPAPPAPP 
VjA X V_/AviVj AVjVj 


TATTA ATTPP 

X A X X A A X X OU 


PGGAGGPAGP 

Vy VJU A U U V^ A U Vy 


AAAGAGGCAT 

AAAV^AV/v/V^A X 


5460 


cTcrcr A rc a 

L; 1 uLfL»u AuUA 


TPTPA AP A AT 
X U X VjA AL/AA X 


PTPAPTAPT A 

Vj X Vj AVj X AVj X A 


PATPTPPATA 

V^A X U X UV>A X A 


PATPTTAAGT 

\jA X W X X A AU X 


TGAGTTGATC 

X V>AV> X X U A X \J 


5520 

O tJ ^ V/ 


TAT Apr A APT 
1 Al AUuAAL/1 


PPP ATTPPA A 
VjOVjAX XVjUAA 


P ATP A A ATTP 

V^ A X V^ A A A X X Vj 


TPTPPPPPPT 

X X VJwUUVyVJ X 


GAGAAPTGCG 

U AU A AV^ X UOU 


AGCGAGAAAA 

A V^ Vy V/ A Vd/ A A A A 


5580 


ATPPPA A APP 
A 1 OOUA AAOO 


PPA ATPPPAP 
VjUA A X V^VjV^AV^ 


A A AP A A ATAP 

AAAL/AAA X AVj 


TPAPAPGAA A 

X UAOAV-'VJAAA 


PAGATTATTC 

VyAU A X X A X X Vy 


TGGTAGGTGT 

X UU X AUW X U X 


5640 


PPTPPPTATA 
uUIUVjUI Al A 


T A APAP A ATT 

XAAVjAVjtAAX X 


TTTA APATP A 

X X IAAVjAXOa 


TATPATGATP 

X A X V^A X VJA X \J 


AAGACATCTA 

A AU Aw A X V/ X A 


AAGGCATTCA 

XXXXVJ VJ\_/XX X X \JJlX 


5700 


TTTTPn APTA 
ill lUUAOX A 


P ATTPTTTTT 

OA XXwXXXXX 


TAPA AAA AAT 

X AV/AA AA AA X 


ATAAGAAGGA 

A 1 A AVyAAV/V/A 


GATATTTTAA 

U A X A X X X X AXl 


GCTGATCCTA 

VJ \J X VJ XX X \J\J X XX 


5760 

\^ • V^ V 


PATPPAPA A A 

VjA X UL/AL/ A AA 


A AATAA ATA A 

AAA X AAA X AA 


AAGTATA AAP 

AAVJ X a X A AA W 


PTAGTTGGTA 

\J 1 J\\J X X VjU X A 


GGATACTTCG 

vjvjxixxjIv>x X vy vj 


TTTTGTTCGG 

X X X. X VJ X. X VJ VJ 


5820 


PPTTAPATPA 

VjVj X X Au A X VjA 


PPATA APPPT 
VjOA X AAV^UVy X 


TPTAGTTGAT 

X VJ X AVJ X X VJA X 


ATTTGAGATG 

AX X XUAUAXw 


CCCTATCATT 

\J\J\J X XX X \JXX X X 


GCAGGGTGAC 

VJVyXXVJ^I^I X ^mI^X^m/ 


5880 


APPPPAPPPT 


TPPPAPAPPT 

X VyVJVyAVJAVJV^ X 


ppATTAAPGA 

VJV^A X X AAV^V/A 


GGGCTTCGGG 

VJUUWX XVd/VJVJVJ 


CAGGCCAAAA 


ACTAGGGCAC 

XX X AA V^ V^ 


5940 


PPTPPTnPPA 


PPPAPTPPPP 

V-/V/V-/ AVJ X v> vJVJ 


nGGAGGAPTC 

VyVJVJ AVJVJ A Vy X v> 


CGGTTCAGGG 

V^UU X X V/AUUU 


AGCGGCCAAC 

\m* Vy VJ VJ V^ Vy AX AX 


TAGCCGAGAA 

X AA VJ V^ VJ AA VJ AM. AX 


6000 

V^ V w 


PPTPAPHTAT 

L/w 1 w Aww 1 A 1 


PPPTPPPAPA 

VJ WW X VJVJwAV^A 


ATATGPiACAT 

A X A X VJVJ AWA X 


CTTTGGGGCG 

V^ X X X VJ VJ VJ VJ V^VJ 


GTCAATCAGC 


CGGGCTCCGG 


6060 

v^ v^ ^# 


ATPTnCPiGPAPi 

A 1 VJVJ V_/VJ Vj Vjil\J 


PTPPiTPAACC 


GGACACGCGG 

VJ VI A W A Vd/U wVJ VJ 


ACTATTCTGC 

iX \J X. X X \J X VJl V/ 


AACGAGCGAG 


ACATACCGGC 


6120 


nPPP APiPiAAA 

UWWwAVJVJ AA A 


PATTTGCTCA 

w A XXX \J\J X WA 


AGAACGGTGA 

AU AAV/VJU X UA 


GTTTCTATTC 

VJ XXX \J X XX X X \J 


GCAGTCGGCT 


GATCTGTGTG 


6180 


AAATCTTAAT AAAGGGTCCA ATTACCAATT TGAAACTCAG TTTGCGGCGT GGCCTATCCG 6240

GGCGAACTTT TGGGCGTGAT GGGCAGTTCC GGTGCCGGAA AGACGACCCT GCTGAATGCC 6300

CTTGCCTTTC GATCGCCGCA GGGCATCCAA GTATCGCCAT CCGGGATGCG ACTGGTCAAT 6360

GGCCAACCTG TGGACGCCAA GGAGATGCAG GCCAGGTGCG CCTATGTCCA GGAGGATGAG 6420

CTGTTTATCG GCTCCCTAAC GGCCAGGGAA GACCTGATTT TCCAGGCCAT GGTGGGGATG 6480
CCACGACATC TGACCTATCG GGAGCGAGTG GCCCGGGTGG ATCAGGTGAT GCAGGAGCTT 6540

TCGCTCAGCA AATGTGAGCA GACGATGATC GGTGTGCCGG GCAGGGTGAA AGGTGTGTCG 6600

GGCGGAGAAA GGAAGCGTCT GGCATTGGCC TCCGAGGGAG TAACGGATCG GCGGGTTGTG 6660

ATGTGCGATG AGCGCAGGTC CGGACTGGAC TCATTTACCG GCGAGAGGGT GGTGGAGGTG 6720

CTGAAGAAGC TGTGGCAGAA GGGGAAGACC GTCATGGTGA GCATTGATCA GCGGTCnCC 6780

GAGCTGTTTG AGCTCTTTGA CAAGATCGTT CTGATGGCGG AGGGGAGGGT AGGTTTGTTG 6840

GGCACTCCCA GCGAAGCCGT CGAGTTGTTT TCCTAGTGAG TTGGATGTGT TTATTAAGGG 6900

TATCTAGCAT TACATTACAT CTGAAGTGCT ATCGAGGGTG GGTGCCGAGT GTGGTAGCAA 6960

CTACAATGCG GGGGACTTTT AGGTAGAGGT GTTGGGGGTT GTGGGCGGAC GGGAGATGGA 7020

GTCCCGTGAT CGGATCGCCA AGATATGGGA CAATTTTGCT ATTAGGAAAG TAGCCCGGGA 7080

TATGGAGCAG TTGTTGGGCA CCAAAAATTT GGAGAAGCGA GTGGAGGAGG GGGAGAATGG 7140

GTACACGTAC AAGGCCAGCT GGTTCATGCA GTTGCGGGCG GTGGTGTGGC GATGGTGGCT 7200

GTCGGTGCTC AAGGAACGAG TCCTCGTAAA AGTGCGACTT ATTCAGAGAA CGGTGAGTGG 7260

TTCCAGTGGA AACAAATGAT ATAACGGTTA CAATTCTTGG AAAGAAATTG GCTAGATTTT 7320

AGTTAGAATT GCCTGATTCC ACACCCTTCT TAGTTTTTTT GAATGAGATG TATAGTTTAT 7380

AGTTTTGCAG AAAATAAATA AATTTCATTT AACTCGGGAA CATGTTGAAG ATATGAATAT 7440

TAATGAGATG CGAGTAACAT TTTAATTTGC AGATGGTTGC CATGTTGATT GGCCTGATCT 7500

TTTTGGGCCA ACAACTCACG CAAGTGGGGG TGATGAATAT CAACGGAGCC ATGTTCGTCT 7560

TCCTGACCAA CATGACCTTT CAAAACGTGT TTGCCAGGAT AAATGTAAGT GTTGTTTAGA 7620

ATACATTTGC ATATTAATAA TTTAGTAACT TTCTAATGAA TCGATTGGAT TTAGGTGTTG 7680

ACCTGAGAGC TGCCAGTTTT TATGAGGGAG GCCGGAAGTG GAGTTTATGG GTGTGACACA 7740

TACTTTCTGG GCAAAACGAT TGCCGAATTA CCGGTTTTTG TGACAGTGGC AGTGGTCTTC 7800

ACGGCGATTG GGTATCCGAT GATCGGACTG CGGGCGGGAG TGGTGCAGTT GTTGAAGTGG 7860

CTGGCGCTGG TGAGTCTGGT GGCGAATGTG TCAACGTCGT TGGGATATGT AATATCCTGC 7920

GCCAGCTGCT CGACGTCGAT GGCGCTGTGT GTGGGTGCGG GGGTTATGAT AGGATTCGTG 7980

CTGTTTGGCG GCTTCTTCTT GAACTCGGGG TGGGTGGCAG TATACCTGAA ATGGTTGTCG 8040

TACCTCTCAT GGTTCCGTTA CGCGAAGGAG GGTGTGGTGA TTAACCAATG GGCGGAGGTG 8100

GAGGGGGGCG AAATTAGCTG CACATGGTGG AACACGACGT GGGGCAGTTC GGGGAAGGTG 8160

ATGCTGGAGA CGGTTAACTT GTCCGCGGGC GATGTGGCGG TGGACTAGGT GGGTCTGGCG 8220



ATTCTCATCG 


TGAGCTTCCG 


GGTGCTCGCA 


TATGTGGCTC 


TAAGAGTTGG 


GGCGGGAGGG 


8280 


AAGGAGTAGA 


AGGTAAGTAG 


CGGCCGCACG 


TAAGGGTTAA 


TGTTTTGAAA 


AAAAAATTCG 


8340 


TCCGCACACA 


ACCTTTCCTC 


TCAACAAGCA 


AAGGTGGACT 


GAATTTAAGT 


GTATACTTGG 


8400 


GTAAGCTTCG 


GCTATCGACG 


GGACCACCTT 


ATGTTATTTC 


ATGATGGGGC 


AGAGCGACGT 


8460 


AGTCCAGCGG 


CAGATCGGCG 


GCGGAGAAGT 


TAAGGGTCTC 


CAGGATGAGC 


TTGGGGGAAG 


8520 


TGGGGCACGT 


GGTGTTCGAC 


GATGTGCAGC 


TAATTTGGGC 


CGGGTGCACG 


TGCGGGGATT 


8580 


GGTTAATCAG 


CAGACCCTCG 


TTGGCGTAAC 


GGAAGCATGA 


GAGGTAGGAG 


AACGATTTGA 


8640 


GGTATACTGG 


CACCGAGCCC 


GAGTTCAAGA 


AGAAGGGGTT 


TTTCGATAGG 


CTCGGGCGCC 


8700 


CTGACGAGCA 


TCACAAAAAT 


CGACGCTCAA 


GTCAGAGGTG 


GGGAAAGGCG 


AGAGGACTAT 


8760 


AAAGATACCA 


GGCGTTTCCC 


CCTGGAAGCT 


CCCTCGTGCG 


CTGTGGTGTT 


GGGAGGCTGG 


8820 


CGCTTACCGG 


ATACCTGTCC 


GCCTTTCTCC 


CTTGGGGAAG 


CGTGGCGCTT 


TGTCAATGCT 


8880 


CACGCTGTAG 


GTATCTCAGT 


TGGGTGTAGG 


TGGTTCGCTC 


CAAGGTGGGG 


TGTGTGGACG 


8940 


AACCCCCCGT 


TCAGCCCGAC 


CGCTGCGCCT 


TATGGGGTAA 


CTATGGTGTT 


GAGTGGAACC 


9000 


CGGTAAGACA 


CGACTTATCG 


CGACTGGCAG 


CAGGGAGTGG 


TAACAGGATT 


AGCAGAGGGA 


9060 


GGTATGTAGG 


CGGTGCTACA 


GAGTTCTTGA 


AGTGGTGGGC 


TAACTAGGGG 


TACACTAGAA 


9120 


GGACAGTATT 


TGGTATCTGC 


GCTCTGCTGA 


AGCGAGTTAC 


CTTGGGAAAA 


AGAGTTGGTA 


9180 


GCTCTTGATG 


CGGCAAACAA 


ACCAGGGCTG 


GTAGGGGTGG 


TTTTTTTGTT 


TGCAAGGAGG 


9240 


AGATTACGCG 


CAGAAAAAAA 


GGATCTCAAG 


AAGATCGTTT 


GATCTTTTCT 


AGGGGGTGTG 


9300 


ACGCTCAGTG 


GAACGAAAAC 


TCACGTTAAG 


GGATTTTGGT 


CATGAGATTA 


TGAAAAAGGA 


9360 


TCTTCACCTA 


GATCCTTTTA 


AATTAAAAAT 


GAAGTTTTAA 


ATGAATGTAA 


AGTATATATG 


9420 


AGTAAACTTG 


GTCTGACAGT 


TAGGAATGCT 


TAATCAGTGA 


GGGAGGTATG 


TGAGCGATGT 


9480 


GTCTATTTCG 


TTCATCCATA 


GTTGCCTGAC 


TGGGCGTGGT 


GTAGATAACT 


AGGATACGGG 


9540 


AGGGCTTACC 


ATCTGGCCCC 


AGTGCTGCAA 


TGATACCGGG 


AGACCGACGC 


TGAGGGGGTG 


9600 


CAGATTTATC 


AGCAATAAAC 


CAGGCAGCCG 


GAAGGGGGGA 


GCGGAGAAGT 


GGTGGTGGAA 


9660 


CTTTATCCGG 


CTCCATCGAG 


TCTATTAATT 


GTTGGCGGGA 


AGGTAGAGTA 


AGTAGTTGGC 


9720 


CAGTTAATAG 


TTTGCGCAAC 


GTTGTTGCCA 


TTGGTACAGG 


CATGGTGGTG 


TGAGGGTGGT 


9780 


GGTTTGGTAT 


GGCTTCATTC 


AGGTCCGGTT 


CGGAACGATC 


AAGGGGAGTT 


AGATGATGCC 


9840 


CCATGTTGTG 


CAAAAAAGCG 


GTTAGCTCCT 


TGGGTCGTGC 


GATCGTTGTC 


AGAAGTAAGT 


9900 


TGGGCGCAGT 


GTTATCACTC 


ATGGTTATGG 


CAGCACTGGA 


TAATTCTCTT 


AGTGTGATGC 


9960 



CATCCGTAAG 


ATGCTTTTCT 


GTGACTGGTG 


AGTACTCAAC 


CAAGTCATTC TGAGAATAGT 


10020 


GTATGCGGCG 


ACGGAGTTGC 


TCTTGCCCGG 


CGTCAACACG 


GGATAATACC 


GCGCGACATA 


10080 


GCAGAACTTT 


AAAAGTGCTC 


ATCATTGGAA 


AACGTTCTTC 


GGGGCGAAAA 


CTCTGAAGGA 


10140 


TCTTACCGCT 


GTTGAGATCC 


AGTTCGATGT 


AACCCACTCG 


TGCACCCAAC TGATCTTCAG 


10200 


CATCTTTTAC 


TTTCACCAGC 


GTTTCTGGGT 


GAGCAAAAAC 


AGGAAGGCAA 


AATGCCGCAA 


10260 


AAAAGGGAAT 


AAGGGCGACA 


CGGAAATGTT 


GAATACTCAT 


ACTCTTCCTT TTTCAATATT 


10320 


ATTGAAGCAT 


TTATCAGGGT 


TATTGTCTCA 


TGAGCGGATA 


CATATTTGAA 


TGTATTTAGA 


10380 


AAAATAAACA 


AATAGGGGTT 


GCGCGCACAT 


TTCCCCGAAA 


AGTGCCACCT 


GACGTCTAAG 


10440 


AAACCATTAT 


TATCATGACA 


TTAACCTATA 


AAAATAGGCG 


TATCACGAGG 


CCCTTTCGTC 


10500 


TCGCGCGTTT 


CGGTGATGAC 


GGTGAAAACC 


TCTGACACAT 


GCAGCTCCCG 


GAGACGGTCA 


10560 


CAGCTTGTCT 


GTAAGGGGAT 


GCCGGGAGCA 


GACAAGCCCG 


TCAGGGCGCG 


TCAGCGGGTG 


10620 


TTGGCGGGTG 


TCGGGGCTGG 


CTTAACTATG 


CGGCATCAGA 


GCAGATTGTA 


CTGAGAGTGC 


10680 


ACCATATGCG 


GTGTGAAATA 


CCGCAGCGAA 


TCGCGCGGAA 


CTAACGACAG 


TCGCTCCAAG 


10740 


GTCGTCGAAC 


AAAAGGTGAA 


TGTGTTGCGG 


AGAGCGGGTG 


GGAGACAGCG 


AAAGAGCAAC 


10800 


TACGAAACGT 


GGTGTGGTGG 


AGGTGAATTA 


TGAAGAGGGC 


GCGGGATTTG 


AAAAGTATGT 


10860 


ATATAAAAAA 


TATATCCCGG 


TGTTTTATGT 


AGCGATAAAC 


GAGTTTTTGA 


TGTAAGGTAT 


10920 


GCAGGTGTGT 


AAGTCTTTTG 


GTTAGAAGAC 


AAATCCAAAG 


TCTACTTGTG 


GGGATGTTCG 


10980 


AAGGGGAAAT 


ACTTGTATTC 


TATAGGTCAT 


ATCTTGTTTT 


TATTGGCACA 


AATATAATTA 


11040 


CATTAGCTTT 


TTGAGGGGGC 


AATAAACAGT 


AAACACGATG 


GTAATAATGG 


TAAAAAAAAA 


11100 


AACAAGCAGT 


TATTTCGGAT 


ATATGTCGGC 


TACTCCTTGC 


GTCGGGCCCG 


AAGTCTTAGA 


11160 


GCCAGATATG 


CGAGCACCCG 


GAAGCTCACG 


ATGAGAATGG 


CCAGAC 




11206 